[Python-checkins] python/nondist/sandbox/string alt292.py, 1.2, 1.3 curry292.py, 1.2, 1.3
rhettinger at users.sourceforge.net
Tue Sep 7 06:41:58 CEST 2004
Update of /cvsroot/python/python/nondist/sandbox/string
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv31196
Modified Files:
alt292.py curry292.py
Log Message:
* Adopted Martin's suggestion for trapping end user placeholder name errors.
Now depends on the Unicode definition of alphanumeric rather than locale
specific definitions. The resulting code is cleaner and will run the
same across all platforms and locale settings.
* Reformatted the comments in the doctests.
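
To illustrate the first item, here is a minimal sketch in present-day Python of how a negative lookahead combined with Unicode-aware \w rejects a placeholder such as $ma\u00f1ana instead of partially matching it. The name dollarsub_sketch and its signature are hypothetical and are not taken from alt292.py; only the three pattern alternatives are copied from the diff below.

```python
import re

# The three alternatives mirror the ones shown in the diff; re.UNICODE is
# already the default for str patterns in Python 3 and is spelled out only
# to match the commit.
_pattern = re.compile(r"""
    \$([_a-z][_a-z0-9]*(?!\w))|   # $ and a Python identifier
    \$\{([_a-z][_a-z0-9]*)\}|     # $ and a brace delimited identifier
    \$(\S*)                       # catchall for ill-formed $ expressions
""", re.IGNORECASE | re.VERBOSE | re.UNICODE)


def dollarsub_sketch(template, mapping):
    """Hypothetical helper: substitute $name and ${name} from mapping."""
    def convert(mo):
        named, braced, bogus = mo.groups()
        if bogus is not None:
            # Report the 1-based line on which the bad placeholder starts.
            lineno = template.count("\n", 0, mo.start()) + 1
            raise ValueError(
                "Invalid placeholder on line %d: %r" % (lineno, bogus))
        key = named if named is not None else braced
        return str(mapping[key])
    return _pattern.sub(convert, template)
```

With this sketch, dollarsub_sketch('the $xxx and', {'xxx': '10'}) gives 'the 10 and', while a template containing $ma\u00f1ana raises ValueError. The lookahead (?!\w) makes the identifier alternative fail at every backtracking position, so the catchall alternative captures the whole ill-formed token.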
Index: alt292.py
===================================================================
RCS file: /cvsroot/python/python/nondist/sandbox/string/alt292.py,v
retrieving revision 1.2
retrieving revision 1.3
diff -u -d -r1.2 -r1.3
--- alt292.py 7 Sep 2004 04:26:43 -0000 1.2
+++ alt292.py 7 Sep 2004 04:41:54 -0000 1.3
@@ -8,7 +8,9 @@
'the 10 and'
-Next, it makes sure the return type is a str if all the inputs are a str. Any unicode components will cause a unicode output. This matches the behavior of other re and string ops:
+Next, it makes sure the return type is a str if all the inputs are a str. Any
+unicode components will cause a unicode output. This matches the behavior of
+other re and string ops:
>>> dollarsub('the $xxx and', xxx='10')
'the 10 and'
@@ -28,7 +30,8 @@
u'the 10 and'
-The ValueErrors are now more specific. They include the line number and the mismatched token:
+The ValueErrors are now more specific. They include the line number and the
+mismatched token:
>>> t = """line one
... line two
@@ -40,18 +43,30 @@
ValueError: Invalid placeholder on line 3: '@malformed'
-Also, the re pattern was changed just a bit to catch an important class of locale specific errors where a user may use a non-ASCII identifier. The previous implementation would match up to the first non-ASCII character and then return a KeyError if the abbreviated is (hopefully) found. Now, it returns a value error highlighting the problem identifier. Note, we still only accept Python identifiers but have improved error detection:
+Also, the re pattern was changed just a bit to catch an important class of
+language specific errors where a user may use a non-ASCII identifier. The
+previous implementation would match up to the first non-ASCII character and
+then return a KeyError if the abbreviated is (hopefully) found. Now, it
+returns a value error highlighting the problem identifier. Note, we still
+only accept Python identifiers but have improved error detection:
->>> import locale
->>> savloc = locale.setlocale(locale.LC_ALL)
->>> _ = locale.setlocale(locale.LC_ALL, 'spanish')
>>> t = u'Returning $ma\u00F1ana or later.'
>>> dollarsub(t, {})
Traceback (most recent call last):
. . .
ValueError: Invalid placeholder on line 1: u'ma\xf1ana'
->>> _ = locale.setlocale(locale.LC_ALL, savloc)
+
+Exercise safe substitution:
+
+>>> safedollarsub('$$ $name ${rank}', name='Guido', rank='BDFL')
+'$ Guido BDFL'
+>>> safedollarsub('$$ $name ${rank}')
+'$ $name ${rank}'
+>>> safedollarsub('$$ $@malformed ${rank}')
+Traceback (most recent call last):
+ . . .
+ValueError: Invalid placeholder on line 1: '@malformed'
'''
@@ -65,11 +80,11 @@
\$([_a-z][_a-z0-9]*(?!\w))| # $ and a Python identifier
\${([_a-z][_a-z0-9]*)}| # $ and a brace delimited identifier
\$(\S*) # Catchall for ill-formed $ expressions
-""", _re.IGNORECASE | _re.VERBOSE | _re.LOCALE)
+""", _re.IGNORECASE | _re.VERBOSE | _re.UNICODE)
# Pattern notes:
#
# The pattern for $identifier includes a negative lookahead assertion
-# to make sure that the identifier is not followed by a locale specific
+# to make sure that the identifier is not followed by a Unicode
# alphanumeric character other than [_a-z0-9]. The idea is to make sure
# not to partially match an ill-formed identifiers containing characters
# from other alphabets. Without the assertion the Spanish word for
Index: curry292.py
===================================================================
RCS file: /cvsroot/python/python/nondist/sandbox/string/curry292.py,v
retrieving revision 1.2
retrieving revision 1.3
diff -u -d -r1.2 -r1.3
--- curry292.py 7 Sep 2004 04:26:44 -0000 1.2
+++ curry292.py 7 Sep 2004 04:41:54 -0000 1.3
@@ -8,7 +8,9 @@
'the 10 and'
-Next, it makes sure the return type is a str if all the inputs are a str. Any unicode components will cause a unicode output. This matches the behavior of other re and string ops:
+Next, it makes sure the return type is a str if all the inputs are a str. Any
+unicode components will cause a unicode output. This matches the behavior of
+other re and string ops:
>>> Template('the $xxx and')(xxx='10')
'the 10 and'
@@ -28,7 +30,8 @@
u'the 10 and'
-The ValueErrors are now more specific. They include the line number and the mismatched token:
+The ValueErrors are now more specific. They include the line number and the
+mismatched token:
>>> t = """line one
... line two
@@ -40,18 +43,19 @@
ValueError: Invalid placeholder on line 3: '@malformed'
-Also, the re pattern was changed just a bit to catch an important class of locale specific errors where a user may use a non-ASCII identifier. The previous implementation would match up to the first non-ASCII character and then return a KeyError if the abbreviated is (hopefully) found. Now, it returns a value error highlighting the problem identifier. Note, we still only accept Python identifiers but have improved error detection:
+Also, the re pattern was changed just a bit to catch an important class of
+language specific errors where a user may use a non-ASCII identifier. The
+previous implementation would match up to the first non-ASCII character and
+then return a KeyError if the abbreviated is (hopefully) found. Now, it
+returns a value error highlighting the problem identifier. Note, we still
+only accept Python identifiers but have improved error detection:
->>> import locale
->>> savloc = locale.setlocale(locale.LC_ALL)
->>> _ = locale.setlocale(locale.LC_ALL, 'spanish')
>>> t = u'Returning $ma\u00F1ana or later.'
>>> Template(t)({})
Traceback (most recent call last):
. . .
ValueError: Invalid placeholder on line 1: u'ma\xf1ana'
->>> _ = locale.setlocale(locale.LC_ALL, savloc)
Exercise safe substitution:
@@ -80,11 +84,11 @@
\$([_a-z][_a-z0-9]*(?!\w))| # $ and a Python identifier
\${([_a-z][_a-z0-9]*)}| # $ and a brace delimited identifier
\$(\S*) # Catchall for ill-formed $ expressions
- """, _re.IGNORECASE | _re.VERBOSE | _re.LOCALE)
+ """, _re.IGNORECASE | _re.VERBOSE | _re.UNICODE)
# Pattern notes:
#
# The pattern for $identifier includes a negative lookahead assertion
- # to make sure that the identifier is not followed by a locale specific
+ # to make sure that the identifier is not followed by a Unicode
# alphanumeric character other than [_a-z0-9]. The idea is to make sure
# not to partially match an ill-formed identifiers containing characters
# from other alphabets. Without the assertion the Spanish word for
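
The safe-substitution doctests added to alt292.py above can be reproduced with a short sketch along the following lines. It is an illustration under stated assumptions, not the sandbox implementation: the name safedollarsub_sketch is hypothetical, and the leading alternative for the $$ escape is assumed, since that part of the pattern falls outside the diff context.

```python
import re

_pattern = re.compile(r"""
    (\$\$)|                       # escaped dollar sign (assumed, not in the diff)
    \$([_a-z][_a-z0-9]*(?!\w))|   # $ and a Python identifier
    \$\{([_a-z][_a-z0-9]*)\}|     # $ and a brace delimited identifier
    \$(\S*)                       # catchall for ill-formed $ expressions
""", re.IGNORECASE | re.VERBOSE | re.UNICODE)


def safedollarsub_sketch(template, **mapping):
    """Hypothetical helper: leave unknown placeholders untouched."""
    def convert(mo):
        escaped, named, braced, bogus = mo.groups()
        if escaped is not None:
            return "$"
        if bogus is not None:
            # Malformed placeholders are still reported, even in safe mode.
            lineno = template.count("\n", 0, mo.start()) + 1
            raise ValueError(
                "Invalid placeholder on line %d: %r" % (lineno, bogus))
        key = named if named is not None else braced
        if key in mapping:
            return str(mapping[key])
        # Missing key: return the placeholder text exactly as written.
        return mo.group(0)
    return _pattern.sub(convert, template)
```

Called as safedollarsub_sketch('$$ $name ${rank}', name='Guido', rank='BDFL'), the sketch returns '$ Guido BDFL'; with no keyword arguments it returns '$ $name ${rank}'; and '$$ $@malformed ${rank}' raises ValueError, matching the three doctests.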