[Python-checkins] r55399 - peps/trunk/pep-3131.txt
martin.v.loewis
python-checkins at python.org
Thu May 17 11:01:44 CEST 2007
Author: martin.v.loewis
Date: Thu May 17 11:01:43 2007
New Revision: 55399
Modified:
peps/trunk/pep-3131.txt
Log:
Include Other_ID_{Start|Continue}.
Modified: peps/trunk/pep-3131.txt
==============================================================================
--- peps/trunk/pep-3131.txt (original)
+++ peps/trunk/pep-3131.txt Thu May 17 11:01:43 2007
@@ -68,17 +68,19 @@
The identifier syntax is ``<ID_Start> <ID_Continue>*``.
-``ID_Start`` is defined as all characters having one of the general categories
-uppercase letters (Lu), lowercase letters (Ll), titlecase letters (Lt), modifier
-letters (Lm), other letters (Lo), letter numbers (Nl), plus the underscore (XXX
-what are "stability extensions" listed in UAX 31).
-
-``ID_Continue`` is defined as all characters in ``ID_Start``, plus nonspacing
-marks (Mn), spacing combining marks (Mc), decimal number (Nd), and connector
-punctuations (Pc).
+``ID_Start`` is defined as all characters having one of the general
+categories uppercase letters (Lu), lowercase letters (Ll), titlecase
+letters (Lt), modifier letters (Lm), other letters (Lo), letter
+numbers (Nl), the underscore, and characters carrying the
+Other_ID_Start property.
+
+``ID_Continue`` is defined as all characters in ``ID_Start``, plus
+nonspacing marks (Mn), spacing combining marks (Mc), decimal number
+(Nd), connector punctuations (Pc), and characters carrying the
+Other_ID_Continue property.
-All identifiers are converted into the normal form NFC while parsing; comparison
-of identifiers is based on NFC.
+All identifiers are converted into the normal form NFC while parsing;
+comparison of identifiers is based on NFC.
Policy Specification
====================
@@ -97,18 +99,19 @@
The following changes will need to be made to the parser:
-1. If a non-ASCII character is found in the UTF-8 representation of the source
- code, a forward scan is made to find the first ASCII non-identifier character
- (e.g. a space or punctuation character)
-
-2. The entire UTF-8 string is passed to a function to normalize the string to
- NFC, and then verify that it follows the identifier syntax. No such callout
- is made for pure-ASCII identifiers, which continue to be parsed the way they
- are today.
-
-3. If this specification is implemented for 2.x, reflective libraries (such as
- pydoc) must be verified to continue to work when Unicode strings appear in
- ``__dict__`` slots as keys.
+1. If a non-ASCII character is found in the UTF-8 representation of
+ the source code, a forward scan is made to find the first ASCII
+ non-identifier character (e.g. a space or punctuation character)
+
+2. The entire UTF-8 string is passed to a function to normalize the
+ string to NFC, and then verify that it follows the identifier
+ syntax. No such callout is made for pure-ASCII identifiers, which
+ continue to be parsed the way they are today. The Unicode database
+ must start including the Other_ID_{Start|Continue} property.
+
+3. If this specification is implemented for 2.x, reflective libraries
+ (such as pydoc) must be verified to continue to work when Unicode
+ strings appear in ``__dict__`` slots as keys.
References
==========
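
For readers of this checkin, the revised identifier rule can be sketched in Python as below. This is only an illustration of the text in the diff, not the parser change itself. The helper names (``is_id_start``, ``is_id_continue``, ``normalize_identifier``) are hypothetical. The two ``OTHER_ID_*`` sets are a hand-copied, possibly incomplete subset of the Other_ID_Start and Other_ID_Continue entries in the Unicode ``PropList.txt``, because the ``unicodedata`` module does not expose those properties directly; check them against the Unicode version you target.

```python
import unicodedata

# General categories named in the PEP text.
ID_START_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
ID_CONTINUE_EXTRA_CATEGORIES = {"Mn", "Mc", "Nd", "Pc"}

# Assumption: illustrative subsets copied by hand from PropList.txt.
# They may be incomplete for any given Unicode version.
OTHER_ID_START = {"\u2118", "\u212e", "\u309b", "\u309c"}
OTHER_ID_CONTINUE = {"\u00b7", "\u0387", "\u19da"} | {
    chr(cp) for cp in range(0x1369, 0x1372)  # U+1369..U+1371
}


def is_id_start(ch):
    """True if ch may begin an identifier under the rule in the diff."""
    return (
        ch == "_"
        or unicodedata.category(ch) in ID_START_CATEGORIES
        or ch in OTHER_ID_START
    )


def is_id_continue(ch):
    """True if ch may appear after the first character of an identifier."""
    return (
        is_id_start(ch)
        or unicodedata.category(ch) in ID_CONTINUE_EXTRA_CATEGORIES
        or ch in OTHER_ID_CONTINUE
    )


def normalize_identifier(text):
    """Normalize text to NFC, then check <ID_Start> <ID_Continue>*.

    Returns the NFC form if it is a valid identifier, otherwise None.
    """
    normalized = unicodedata.normalize("NFC", text)
    if not normalized or not is_id_start(normalized[0]):
        return None
    if not all(is_id_continue(ch) for ch in normalized[1:]):
        return None
    return normalized
```

Because normalization happens before the syntax check, two spellings of the same identifier (for example, a precomposed "é" versus "e" followed by a combining acute accent) yield the same NFC string and therefore compare equal.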