[Python-3000] Unicode IDs -- why NFC? Why allow ligatures?

Rauli Ruohonen rauli.ruohonen at gmail.com
Tue Jun 5 13:06:53 CEST 2007


On 6/5/07, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> I'd love to get rid of full-width ASCII and halfwidth kana (via
> compatibility decomposition).

If you do forbid compatibility characters in identifiers, then they
should be flagged as an error, not converted silently. NFC, on the
other hand, should be applied silently. The reason is that canonical
equivalence in Unicode is the same thing as binary equality of the NFC
forms, and adding extra equivalences (whether it's "FoO" == "foo",
"カキ" == "ｶｷ" or "A123" == "Ａ１２３") is surprising.
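
To make the distinction concrete, here is a minimal sketch using the standard unicodedata module. The helper names nfc and nfkc and the sample strings are only illustrative assumptions; the point is that NFC merges canonically equivalent spellings, while only NFKC folds fullwidth ASCII and halfwidth kana.

```python
import unicodedata

def nfc(s):
    # Canonical composition: merges canonically equivalent spellings only.
    return unicodedata.normalize("NFC", s)

def nfkc(s):
    # Compatibility composition: also folds compatibility characters.
    return unicodedata.normalize("NFKC", s)

composed = "\u00e9"                      # e-acute as one code point
decomposed = "e\u0301"                   # e + COMBINING ACUTE ACCENT
fullwidth = "\uff21\uff11\uff12\uff13"   # fullwidth A123
halfwidth_kana = "\uff76\uff77"          # halfwidth katakana KA KI
ordinary_kana = "\u30ab\u30ad"           # ordinary katakana KA KI

canonical_same = nfc(composed) == nfc(decomposed)
nfc_keeps_fullwidth = nfc(fullwidth) != "A123"
nfkc_folds_fullwidth = nfkc(fullwidth) == "A123"
nfkc_folds_kana = nfkc(halfwidth_kana) == ordinary_kana

print(canonical_same, nfc_keeps_fullwidth,
      nfkc_folds_fullwidth, nfkc_folds_kana)
```

All four flags should come out true: NFC leaves the compatibility forms alone, which is why applying it silently adds no new equivalences.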

In short, I would like this function to return 'OK' or be a
syntax error, but it should not fail or return something else:

def test():
    if 'A' == 'Ａ': return 'OK'
    A = 'O'
    Ａ = 'K' # as tested above, 'A' and 'Ａ' are not the same thing
    return locals()['A']+locals()['Ａ']

Note that 'A' == 'Ａ' should be false (no automatic NFKC for strings,
please).
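
The policy argued for in this message (apply NFC silently, flag compatibility characters as an error) could be sketched roughly as follows. The function name normalize_identifier is hypothetical and not part of any Python API, and comparing the NFC and NFKC forms is just one assumed way to detect compatibility characters.

```python
import unicodedata

def normalize_identifier(name):
    """Hypothetical helper: NFC silently, reject compatibility characters."""
    nfc_name = unicodedata.normalize("NFC", name)
    # If NFKC would change the NFC form, the name contains a
    # compatibility character (e.g. fullwidth ASCII, halfwidth kana).
    if unicodedata.normalize("NFKC", nfc_name) != nfc_name:
        raise SyntaxError(
            f"compatibility character in identifier {name!r}")
    return nfc_name

def is_rejected(name):
    # Convenience wrapper for the sketch: True if the name is an error.
    try:
        normalize_identifier(name)
    except SyntaxError:
        return True
    return False
```

Under this sketch, "e\u0301" and "\u00e9" name the same identifier, while a fullwidth "\uff21" is reported as an error instead of being silently treated as "A".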

