[Python-3000] Unicode IDs -- why NFC? Why allow ligatures?
Stephen J. Turnbull
stephen at xemacs.org
Wed Jun 6 10:26:33 CEST 2007
Rauli Ruohonen writes:
> There are some cases where users might in the future want to make
> a distinction between "compatibility" characters, such as these:
> http://en.wikipedia.org/wiki/Mathematical_alphanumeric_symbols
I don't think they belong in identifiers in a general-purpose
programming language, though their usefulness for mathematical
typesetting is obvious. I think programs should be verbalizable,
unlike math, where most of the text is not intended to correspond to
any reality but is purely syntactic transformation.
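
For what it's worth, those symbols all carry compatibility mappings to
the ordinary letters and digits they resemble, so a compatibility
normalization erases the very distinction Rauli wants to preserve. A
quick sketch with the stdlib unicodedata module:

    >>> import unicodedata
    >>> bold_a = "\N{MATHEMATICAL BOLD CAPITAL A}"   # U+1D400
    >>> unicodedata.normalize("NFC", bold_a) == bold_a
    True
    >>> unicodedata.normalize("NFKC", bold_a) == "A"
    True

NFC keeps the bold A distinct; NFKC folds it to a plain 'A'.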
> For this reason I think that compatibility transformation, if any,
> should only be applied to characters where there's a practical
> reason to do so, and for other cases punting (=syntax error) is
> safest.
"Banzai Python!" and all that, but even if Python is in use 10,000
years from now, I think compatibility characters will still be a
YAGNI. I admit that's a reasonable compromise, and allows future
extension without gratuitously making existing programs illegal; I
could live with it very easily (but I'd want those full-width ASCII
decomposed :-). I just feel it would be wiser to limit Python
identifiers to NFKC.
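
To make the full-width case concrete: NFC leaves full-width ASCII
letters alone, while NFKC decomposes them to their ordinary ASCII
counterparts. Something like this shows the difference:

    >>> import unicodedata
    >>> fw = "Ｐｙｔｈｏｎ"              # full-width Latin letters
    >>> unicodedata.normalize("NFC", fw) == fw
    True
    >>> unicodedata.normalize("NFKC", fw)
    'Python'

So under NFC the full-width spelling would be a distinct identifier,
while under NFKC it would simply be "Python".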
> I use two Japanese input methods (MS IME and scim/anthy), but only the
> latter one daily. When I type text that mixes Japanese and other
> [...]
> For code that uses a lot of Japanese this may not be convenient,
> but then you'd want to set your input method to use ASCII for ASCII
> anyway,
Both of those go a long way toward addressing the annoyance of syntax
errors in original code, but not in debug/maintenance mode, where you
type only a few characters of code at a time and typically enter them
from user mode.
> You have to go out of your way to type halfwidth katakana, and it
> isn't really useful in identifiers IMHO.
I agree, but then I don't work for the Japanese Social Security
Administration.