[Python-checkins] CVS: python/nondist/peps pep-0215.txt,1.2,1.3
Ka-Ping Yee
ping@users.sourceforge.net
Tue, 30 Jan 2001 09:09:55 -0800
Update of /cvsroot/python/python/nondist/peps
In directory usw-pr-cvs1:/tmp/cvs-serv312
Modified Files:
pep-0215.txt
Log Message:
Initial draft of string interpolation PEP.
Index: pep-0215.txt
===================================================================
RCS file: /cvsroot/python/python/nondist/peps/pep-0215.txt,v
retrieving revision 1.2
retrieving revision 1.3
diff -C2 -r1.2 -r1.3
*** pep-0215.txt 2000/08/23 06:04:33 1.2
--- pep-0215.txt 2001/01/30 17:09:53 1.3
***************
*** 9,12 ****
--- 9,137 ----
Post-History:
+ Abstract
+
+ This document proposes a string interpolation feature for Python
+ to allow easier string formatting. The suggested syntax change
+ is the introduction of a '$' prefix that triggers the special
+ interpretation of the '$' character within a string, in a manner
+ reminiscent of the variable interpolation found in Unix shells,
+ awk, Perl, or Tcl.
+
+
+ Copyright
+
+ This document is in the public domain.
+
+
+ Specification
+
+ Strings may be preceded by a '$' prefix, which comes before the
+ leading single or double quotation mark (or triple quotation mark)
+ and before any of the other string prefixes ('r' or 'u'). Such a string is
+ processed for interpolation after the normal interpretation of
+ backslash-escapes in its contents. The processing occurs just
+ before the string is pushed onto the value stack, each time the
+ string is pushed. In short, Python behaves exactly as if '$'
+ were a unary operator applied to the string. The operation
+ performed is as follows:
+
+ The string is scanned from start to end for the '$' character
+ (\x24 in 8-bit strings or \u0024 in Unicode strings). If there
+ are no '$' characters present, the string is returned unchanged.
+
+ Any '$' found in the string, followed by one of the two kinds of
+ expressions described below, is replaced with the value of the
+ expression as evaluated in the current namespaces. The value is
+ converted with str() if the containing string is an 8-bit string,
+ or with unicode() if it is a Unicode string.
+
+ 1. A Python identifier optionally followed by any number of
+ trailers, where a trailer consists of:
+ - a dot and an identifier,
+ - an expression enclosed in square brackets, or
+ - an argument list enclosed in parentheses
+ (This is exactly the pattern expressed in the Python grammar
+ by "NAME trailer*", using the definitions in Grammar/Grammar.)
+
+ 2. Any complete Python expression enclosed in curly braces.
+
+ Two dollar-signs ("$$") are replaced with a single "$".
+
+
+ Examples
+
+ Here is an example of an interactive session exhibiting the
+ expected behaviour of this feature.
+
+ >>> a, b = 5, 6
+ >>> print $'a = $a, b = $b'
+ a = 5, b = 6
+ >>> $u'uni${a}ode'
+ u'uni5ode'
+ >>> print $'\$a'
+ 5
+ >>> print $r'\$a'
+ \5
+ >>> print $'$$$a.$b'
+ $5.6
+ >>> print $'a + b = ${a + b}'
+ a + b = 11
+ >>> import sys
+ >>> print $'References to $a: $sys.getrefcount(a)'
+ References to 5: 15
+ >>> print $"sys = $sys, sys = $sys.modules['sys']"
+ sys = <module 'sys' (built-in)>, sys = <module 'sys' (built-in)>
+ >>> print $'BDFL = $sys.copyright.split()[4].upper()'
+ BDFL = GUIDO
+
+
+ Discussion
+
+ '$' is chosen as the interpolation character within the
+ string for the sake of familiarity, since it is already used
+ for this purpose in many other languages and contexts.
+
+ It is then natural to choose '$' as a prefix, since it is a
+ mnemonic for the interpolation character.
+
+ Trailers are permitted in order to give this interpolation
+ mechanism more power than the interpolation available in most
+ other languages, while the expression to be interpolated remains
+ clearly visible and free of curly braces.
+
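+ '$' works like an operator and could be implemented as an
+ operator, but that would prevent the compile-time optimization
+ described under Implementation and would present security issues,
+ since the operand could then be any string computed at run time.
+ So, it is only allowed as a string prefix.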
+
+
+ Security Issues
+
+ '$' has the power of eval, but it can only eval a literal. As
+ specified here (as a string prefix rather than an operator), it
+ introduces no new security issues, since the expressions to be
+ evaluated must be literally present in the code.
+
+
+ Implementation
+
+ The Itpl module at http://www.lfw.org/python/Itpl.py provides a
+ prototype of this feature. It uses the tokenize module to find
+ the end of an expression to be interpolated, then calls eval()
+ on the expression each time a value is needed. In the prototype,
+ the expression is parsed and compiled again each time it is
+ evaluated.
+
+ As an optimization, interpolated strings could be compiled
+ directly into the corresponding bytecode; that is,
+
+ $'a = $a, b = $b'
+
+ could be compiled as though it were the expression
+
+ ('a = ' + str(a) + ', b = ' + str(b))
+
+ so that it only needs to be compiled once.
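
The compile-once idea above can be illustrated with a minimal sketch. It assumes a modern Python 3 interpreter and uses only the built-in compile() and eval(). The names EXPANDED_SOURCE and render are hypothetical, chosen for illustration, and the expansion is written out by hand rather than produced by any real compiler support.

```python
# Hand-written expansion of $'a = $a, b = $b', following the
# equivalence suggested in the text. No compiler generates this;
# it is an assumption made for illustration.
EXPANDED_SOURCE = "('a = ' + str(a) + ', b = ' + str(b))"

# Parse and compile the expression a single time ...
CODE = compile(EXPANDED_SOURCE, "<interpolated string>", "eval")


def render(namespace):
    """Evaluate the precompiled expression in the given namespace."""
    # ... then evaluate it as often as needed without re-parsing.
    return eval(CODE, namespace)


print(render({"a": 5, "b": 6}))
print(render({"a": "x", "b": None}))
```

The prototype described above would instead parse and compile each expression on every evaluation; here only the eval() call is repeated.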
+
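
The scanning rule in the Specification section can be sketched as an ordinary function. This is a rough illustration under stated assumptions, not the Itpl prototype. It assumes Python 3, converts values with str() only, and matches identifiers with an ASCII-only pattern. It finds the end of a bracketed trailer by naive bracket counting rather than with the tokenize module. It also leaves a '$' unchanged when no expression follows it, a case the specification does not spell out. The names interpolate and _match_bracket are hypothetical.

```python
import re

_IDENT = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")
_PAIRS = {"(": ")", "[": "]", "{": "}"}


def _match_bracket(text, start):
    """Return the index just past the bracket closing text[start].

    Naive: simple quoted strings are skipped, but backslash escapes
    inside them are not handled.
    """
    stack = []
    i = start
    while i < len(text):
        ch = text[i]
        if ch in "'\"":
            end = text.find(ch, i + 1)  # jump to the closing quote
            if end < 0:
                break
            i = end
        elif ch in _PAIRS:
            stack.append(_PAIRS[ch])
        elif stack and ch == stack[-1]:
            stack.pop()
            if not stack:
                return i + 1
        i += 1
    raise ValueError("unbalanced bracket at position %d" % start)


def interpolate(template, namespace):
    """Expand $$, ${expr} and $name-with-trailers in template."""
    out = []
    pos = 0
    while True:
        dollar = template.find("$", pos)
        if dollar < 0:
            out.append(template[pos:])
            return "".join(out)
        out.append(template[pos:dollar])
        nxt = template[dollar + 1:dollar + 2]
        if nxt == "$":                      # "$$" becomes a single "$"
            out.append("$")
            pos = dollar + 2
            continue
        if nxt == "{":                      # form 2: ${expression}
            end = _match_bracket(template, dollar + 1)
            expr = template[dollar + 2:end - 1]
        else:                               # form 1: NAME trailer*
            m = _IDENT.match(template, dollar + 1)
            if m is None:                   # assumption: keep a bare "$"
                out.append("$")
                pos = dollar + 1
                continue
            end = m.end()
            while end < len(template):
                ch = template[end]
                dot = _IDENT.match(template, end + 1) if ch == "." else None
                if dot:                     # trailer: dot and identifier
                    end = dot.end()
                elif ch in "[(":            # trailer: subscript or call
                    end = _match_bracket(template, end)
                else:
                    break
            expr = template[dollar + 1:end]
        out.append(str(eval(expr, namespace)))
        pos = end


ns = {"a": 5, "b": 6, "name": "guido"}
print(interpolate("a = $a, b = $b", ns))
print(interpolate("$$$a.$b", ns))
print(interpolate("a + b = ${a + b}", ns))
print(interpolate("BDFL = $name.upper()", ns))
```

Because the function takes an arbitrary string at run time, it behaves like the operator form that the Discussion section rejects; the proposal itself restricts interpolation to string literals.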