[Python-checkins] cpython (merge 3.3 -> 3.3): merge
georg.brandl
python-checkins at python.org
Sun Oct 27 07:38:52 CET 2013
http://hg.python.org/cpython/rev/49be8a5c9de2
changeset: 86680:49be8a5c9de2
branch: 3.3
parent: 86679:e445d02e5306
parent: 86672:220e3e40d176
user: Georg Brandl <georg at python.org>
date: Sun Oct 27 07:39:36 2013 +0100
summary:
merge
files:
Lib/sre_compile.py | 10 +++++-----
Modules/_sre.c | 3 +--
2 files changed, 6 insertions(+), 7 deletions(-)
diff --git a/Lib/sre_compile.py b/Lib/sre_compile.py
--- a/Lib/sre_compile.py
+++ b/Lib/sre_compile.py
@@ -276,10 +276,10 @@
# set is constructed. Then, this bitmap is sliced into chunks of 256
# characters, duplicate chunks are eliminated, and each chunk is
# given a number. In the compiled expression, the charset is
-# represented by a 16-bit word sequence, consisting of one word for
-# the number of different chunks, a sequence of 256 bytes (128 words)
+# represented by a 32-bit word sequence, consisting of one word for
+# the number of different chunks, a sequence of 256 bytes (64 words)
# of chunk numbers indexed by their original chunk position, and a
-# sequence of chunks (16 words each).
+# sequence of 256-bit chunks (8 words each).
# Compression is normally good: in a typical charset, large ranges of
# Unicode will be either completely excluded (e.g. if only cyrillic
@@ -292,9 +292,9 @@
# less significant byte is a bit index in the chunk (just like the
# CHARSET matching).
-# In UCS-4 mode, the BIGCHARSET opcode still supports only subsets
+# The BIGCHARSET opcode still supports only subsets
# of the basic multilingual plane; an efficient representation
-# for all of UTF-16 has not yet been developed. This means,
+# for all of Unicode has not yet been developed. This means,
# in particular, that negated charsets cannot be represented as
# bigcharsets.
diff --git a/Modules/_sre.c b/Modules/_sre.c
--- a/Modules/_sre.c
+++ b/Modules/_sre.c
@@ -2749,8 +2749,7 @@
\_________\_____/ /
\____________/
- It also helps that SRE_CODE is always an unsigned type, either 2 bytes or 4
- bytes wide (the latter if Python is compiled for "wide" unicode support).
+ It also helps that SRE_CODE is always an unsigned type.
*/
/* Defining this one enables tracing of the validator */
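For readers following the comment change in Lib/sre_compile.py, the layout it describes can be sketched roughly as below. This is an illustrative sketch only, not the code from sre_compile.py: the function names (build_bigcharset, pack_words, contains) are hypothetical, and the little-endian byte packing and least-significant-bit-first ordering inside each 32-bit word are assumptions made for the example.

```python
def build_bigcharset(chars):
    """Split a BMP bitmap into 256 chunks of 256 characters and deduplicate them.

    Returns (mapping, chunks): mapping[hi] is the chunk number for the block
    of characters whose high byte is hi; chunks holds the distinct chunks.
    """
    bitmap = [False] * 65536
    for c in chars:
        if not 0 <= c <= 0xFFFF:
            raise ValueError("only the basic multilingual plane is supported")
        bitmap[c] = True

    chunk_numbers = {}          # chunk contents -> assigned chunk number
    mapping = bytearray(256)    # one byte per original chunk position
    chunks = []
    for hi in range(256):
        chunk = tuple(bitmap[hi * 256:(hi + 1) * 256])
        number = chunk_numbers.get(chunk)
        if number is None:
            number = len(chunks)
            chunk_numbers[chunk] = number
            chunks.append(chunk)
        mapping[hi] = number
    return mapping, chunks


def pack_words(mapping, chunks):
    """Pack into 32-bit words: count, 64 words of mapping, 8 words per chunk."""
    words = [len(chunks)]
    # 256 bytes of chunk numbers occupy 64 words (byte order is an assumption).
    for i in range(0, 256, 4):
        words.append(int.from_bytes(mapping[i:i + 4], "little"))
    # Each 256-bit chunk occupies 8 words (bit order is an assumption).
    for chunk in chunks:
        for w in range(8):
            word = 0
            for bit in range(32):
                if chunk[w * 32 + bit]:
                    word |= 1 << bit
            words.append(word)
    return words


def contains(mapping, chunks, ch):
    """High byte selects the chunk, low byte is the bit index in that chunk."""
    return chunks[mapping[ch >> 8]][ch & 0xFF]


# Example: the letter "a" plus the basic Cyrillic letters U+0410..U+044F.
mapping, chunks = build_bigcharset([0x61] + list(range(0x0410, 0x0450)))
words = pack_words(mapping, chunks)
```

In this example only three distinct chunks survive deduplication (the one containing "a", the Cyrillic one, and a single shared empty chunk), which is the compression effect the comment refers to.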
--
Repository URL: http://hg.python.org/cpython