[Python-3000] PEP 3131 roundup

"Martin v. Löwis" martin at v.loewis.de
Wed Jun 6 07:15:21 CEST 2007


> I think "obvious" referred to the reasoning, not the outcome.
> 
> I can tell that the decision was "NFC, anything goes", but I don't see why.

I think I'm repeating myself: Because UAX 31 says so. That's it. There
is a standard that experts in the domain have specified, and PEP 3131
follows it. Following standards is a good thing, deviating from them
is a bad thing.

> (2)
> I cannot understand why ID_START/CONTINUE was chosen instead of the
> newer and more recommended XID_START/CONTINUE.  From UAX31 section 2:
> """
> The XID_Start and XID_Continue properties are improved lexical classes
> that incorporate the changes described in Section 5.1, NFKC
> Modifications. They are recommended for most purposes, especially for
> security, over the original ID_Start and ID_Continue properties.
> """

Right. I read it to mean that these properties should be used when the
language adopts the modifications of section 5.1. Those modifications,
in turn, should be applied when the normalization form is NFKC:

"""
Where programming languages are using NFKC to fold differences between
characters, they need the following modifications of the identifier
syntax from the Unicode Standard to deal with the idiosyncrasies of a
small number of characters. These modifications are reflected in the
XID_Start and XID_Continue properties.
"""

As the PEP does not use NFKC (currently), it should not use XID_Start
and XID_Continue either.
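
To illustrate the NFC/NFKC distinction this argument rests on, here is a
minimal sketch using Python's standard unicodedata module. It is not part
of the PEP; the helper name normalize_identifier and the sample strings
are assumptions chosen for illustration. NFC only folds canonically
equivalent spellings, while NFKC additionally folds compatibility
characters, which is the case the section 5.1 modifications are meant to
handle.

```python
import unicodedata


def normalize_identifier(name: str, form: str = "NFC") -> str:
    """Return `name` normalized to the given Unicode normalization form.

    Hypothetical helper for illustration; not an API defined by PEP 3131.
    """
    return unicodedata.normalize(form, name)


# Canonically equivalent spellings: NFC and NFKC both fold them together.
decomposed = "cafe\u0301"   # 'e' followed by COMBINING ACUTE ACCENT
precomposed = "caf\u00e9"   # single code point U+00E9

# Compatibility character: only NFKC folds it.
ligature = "\ufb01le"       # starts with U+FB01 LATIN SMALL LIGATURE FI

# U+037A GREEK YPOGEGRAMMENI has a compatibility decomposition, so it is
# stable under NFC but changes under NFKC.
ypogegrammeni = "\u037a"

if __name__ == "__main__":
    for label, text in [
        ("decomposed", decomposed),
        ("ligature", ligature),
        ("ypogegrammeni", ypogegrammeni),
    ]:
        nfc = normalize_identifier(text, "NFC")
        nfkc = normalize_identifier(text, "NFKC")
        # ascii() makes the code points visible regardless of terminal font.
        print(label, ascii(nfc), ascii(nfkc))
```

Under NFC the ligature and U+037A pass through unchanged, so a language
that normalizes with NFC never sees the instabilities that NFKC
introduces for such characters.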

> Nor can I understand why the additional restrictions in
> xidmodifications (from TR39) were ignored. 

Consideration of UTR 39 is listed as an open issue. One problem
with it is that adopting it could narrow the set of allowed
identifiers over time, so that a program that is correct today might
be rejected by a future version. Using it might therefore break
backwards compatibility.
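
The compatibility concern can be sketched with a toy model. Everything
below is hypothetical: the two allow-lists are placeholders invented for
illustration and are not taken from any real xidmodifications data, and
is_valid_identifier is not how any Python version checks identifiers.

```python
# Hypothetical allow-lists for two language versions; the characters are
# placeholders, not taken from any real xidmodifications data file.
ALLOWED_IN_OLD_VERSION = frozenset("abcdefghijklmnopqrstuvwxyz_\u03bb")
# Assume a later data release removes one character from the allowed set.
ALLOWED_IN_NEW_VERSION = ALLOWED_IN_OLD_VERSION - {"\u03bb"}


def is_valid_identifier(name: str, allowed: frozenset) -> bool:
    """Return True if `name` is non-empty and uses only allowed characters."""
    return bool(name) and all(ch in allowed for ch in name)


if __name__ == "__main__":
    name = "\u03bb_value"  # GREEK SMALL LETTER LAMDA followed by "_value"
    print(is_valid_identifier(name, ALLOWED_IN_OLD_VERSION))
    print(is_valid_identifier(name, ALLOWED_IN_NEW_VERSION))
```

In this model the same source text is accepted by the older version and
rejected by the newer one, which is the kind of breakage described above.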

Regards,
Martin


