[Python-3000] PEP: Supporting Non-ASCII Identifiers

James Y Knight foom at fuhm.net
Tue May 1 18:58:19 CEST 2007


On May 1, 2007, at 12:19 PM, Jim Jewett wrote:

> On 5/1/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>
>> The identifier syntax is <ID_Start> <ID_Continue>*.
>
>> ID_Start is defined as all characters having one of the general
>> categories uppercase letters (Lu), lowercase letters (Ll), titlecase
>> letters (Lt), modifier letters (Lm), other letters (Lo), letter
>> numbers (Nl), plus the underscore (XXX what are "stability extensions"
>> listed in UAX 31).
>
> Are you sure that modifier letters should be included?  The standard
> says so, but as nearly as I can tell, these are really more like
> diacritics -- and some of them look an awful lot like punctuation.
>
>     http://unicode.org/charts/PDF/U02B0.pdf

The entire point of these characters is that they are to be treated
as letters (that is, they can make up part of a word). If punctuation
or diacritics were wanted, the very similar-looking characters in
other parts of the codespace could be used instead. These letters seem
to be mainly intended for spelling out phonetic pronunciations. It's
unlikely that anyone would want to write a Python identifier in IPA,
but that's not a good reason to go against the standard.
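
To make the distinction concrete, here is a minimal sketch that checks
general categories with Python's standard unicodedata module. The names
ID_START_CATEGORIES and looks_like_id_start are made up for this
illustration, and the check only approximates the quoted rule: it
ignores the UAX 31 stability extensions that the draft still marks XXX.

```python
import unicodedata

# General categories the quoted draft lists for ID_Start
# (hypothetical constant name, for illustration only).
ID_START_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}


def looks_like_id_start(ch):
    """Approximate the quoted ID_Start rule: listed categories plus underscore.

    This is an assumption-laden sketch; it does not handle the UAX 31
    stability extensions mentioned in the draft.
    """
    return ch == "_" or unicodedata.category(ch) in ID_START_CATEGORIES


samples = {
    "modifier letter small h": "\u02b0",      # from the U+02B0 chart, category Lm
    "combining circumflex accent": "\u0302",  # a real diacritic, category Mn
    "circumflex accent": "^",                 # ASCII lookalike, category Sk
}

for name, ch in samples.items():
    # Print the code point, its general category, and the sketch's verdict.
    print(f"U+{ord(ch):04X} {name}: "
          f"{unicodedata.category(ch)}, id_start={looks_like_id_start(ch)}")
```

Under this check the modifier letter is accepted as a letter, while the
combining mark and the ASCII circumflex are rejected, which matches the
split the standard draws between them.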

James

