[Python-checkins] CVS: python/nondist/peps pep-0215.txt,1.2,1.3

Ka-Ping Yee ping@users.sourceforge.net
Tue, 30 Jan 2001 09:09:55 -0800


Update of /cvsroot/python/python/nondist/peps
In directory usw-pr-cvs1:/tmp/cvs-serv312

Modified Files:
	pep-0215.txt 
Log Message:
Initial draft of string interpolation PEP.


Index: pep-0215.txt
===================================================================
RCS file: /cvsroot/python/python/nondist/peps/pep-0215.txt,v
retrieving revision 1.2
retrieving revision 1.3
diff -C2 -r1.2 -r1.3
*** pep-0215.txt	2000/08/23 06:04:33	1.2
--- pep-0215.txt	2001/01/30 17:09:53	1.3
***************
*** 9,12 ****
--- 9,137 ----
  Post-History:
  
+ Abstract
+ 
+     This document proposes a string interpolation feature for Python
+     to allow easier string formatting.  The suggested syntax change
+     is the introduction of a '$' prefix that triggers the special
+     interpretation of the '$' character within a string, in a manner
+     reminiscent of the variable interpolation found in Unix shells,
+     awk, Perl, or Tcl.
+ 
+ 
+ Copyright
+ 
+     This document is in the public domain.
+ 
+ 
+ Specification
+ 
+     Strings may be preceded with a '$' prefix that comes before the
+     leading single or double quotation mark (or triplet) and before
+     any of the other string prefixes ('r' or 'u').  Such a string is
+     processed for interpolation after the normal interpretation of
+     backslash-escapes in its contents.  The processing occurs just
+     before the string is pushed onto the value stack, each time the
+     string is pushed.  In short, Python behaves exactly as if '$'
+     were a unary operator applied to the string.  The operation
+     performed is as follows:
+ 
+     The string is scanned from start to end for the '$' character
+     (\x24 in 8-bit strings or \u0024 in Unicode strings).  If there
+     are no '$' characters present, the string is returned unchanged.
+ 
+     Any '$' found in the string, followed by one of the two kinds of
+     expressions described below, is replaced with the value of the
+     expression as evaluated in the current namespaces.  The value is
+     converted with str() if the containing string is an 8-bit string,
+     or with unicode() if it is a Unicode string.
+ 
+     1.  A Python identifier optionally followed by any number of
+         trailers, where a trailer consists of:
+             - a dot and an identifier,
+             - an expression enclosed in square brackets, or
+             - an argument list enclosed in parentheses
+         (This is exactly the pattern expressed in the Python grammar
+         by "NAME trailer*", using the definitions in Grammar/Grammar.)
+ 
+     2.  Any complete Python expression enclosed in curly braces.
+ 
+     Two dollar-signs ("$$") are replaced with a single "$".
+ 
+ 
+ Examples
+ 
+     Here is an example of an interactive session exhibiting the
+     expected behaviour of this feature.
+ 
+         >>> a, b = 5, 6
+         >>> print $'a = $a, b = $b'
+         a = 5, b = 6
+         >>> $u'uni${a}ode'
+         u'uni5ode'
+         >>> print $'\$a'
+         5
+         >>> print $r'\$a'
+         \5
+         >>> print $'$$$a.$b'
+         $5.6
+         >>> print $'a + b = ${a + b}'
+         a + b = 11
+         >>> import sys
+         >>> print $'References to $a: $sys.getrefcount(a)'
+         References to 5: 15
+         >>> print $"sys = $sys, sys = $sys.modules['sys']"
+         sys = <module 'sys' (built-in)>, sys = <module 'sys' (built-in)>
+         >>> print $'BDFL = $sys.copyright.split()[4].upper()'
+         BDFL = GUIDO
+ 
+ 
+ Discussion
+ 
+     '$' is chosen as the interpolation character within the
+     string for the sake of familiarity, since it is already used
+     for this purpose in many other languages and contexts.
+ 
+     It is then natural to choose '$' as a prefix, since it is a
+     mnemonic for the interpolation character.
+ 
+     Trailers are permitted to give this interpolation mechanism
+     even more power than the interpolation available in most other
+     languages, while the expression to be interpolated remains
+     clearly visible and free of curly braces.
+ 
+     '$' works like an operator and could be implemented as an
+     operator, but that would prevent compile-time optimization
+     and would present security issues.  So, it is only allowed as
+     a string prefix.
+ 
+ 
+ Security Issues
+ 
+     "$" has the power to eval, but only to eval a literal.  As
+     described here (a string prefix rather than an operator), it
+     introduces no new security issues since the expressions to be
+     evaluated must be literally present in the code.
+ 
+ 
+ Implementation
+ 
+     The Itpl module at http://www.lfw.org/python/Itpl.py provides a
+     prototype of this feature.  It uses the tokenize module to find
+     the end of an expression to be interpolated, then calls eval()
+     on the expression each time a value is needed.  In the prototype,
+     the expression is parsed and compiled again each time it is
+     evaluated.
+ 
+     As an optimization, interpolated strings could be compiled
+     directly into the corresponding bytecode; that is,
+ 
+         $'a = $a, b = $b'
+ 
+     could be compiled as though it were the expression
+ 
+         ('a = ' + str(a) + ', b = ' + str(b))
+ 
+     so that it only needs to be compiled once.
+
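
The scanning rules in the Specification section of the PEP text above can be sketched as a run-time function in present-day Python. This is only an illustration under stated assumptions: it is not the Itpl prototype and not part of the checkin, and the names interpolate and _group_end are hypothetical. It finds bracket groups by naive depth counting rather than with the tokenize module, so brackets inside string literals would confuse it, and it keeps a lone '$' literally, a case the draft does not spell out. Because it calls eval() on a run-time string, it behaves like the operator form that the Discussion section rejects, so it should only be given trusted templates.

```python
import re

# Matches a Python identifier (ASCII only, for simplicity).
_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _group_end(text, start):
    """Return the index just past the bracket group that opens at text[start].

    Naive depth counting: bracket characters inside string literals are
    not recognised, unlike a tokenizer-based scan.
    """
    depth = 0
    for i in range(start, len(text)):
        if text[i] in "([{":
            depth += 1
        elif text[i] in ")]}":
            depth -= 1
            if depth == 0:
                return i + 1
    raise ValueError("unbalanced bracket starting at offset %d" % start)


def interpolate(text, namespace):
    """Expand $$, $name-with-trailers and ${expression} in text."""
    env = dict(namespace)  # copy, because eval() adds __builtins__ to it
    out = []
    pos = 0
    while True:
        dollar = text.find("$", pos)
        if dollar < 0:
            out.append(text[pos:])
            break
        out.append(text[pos:dollar])
        following = text[dollar + 1:dollar + 2]
        if following == "$":
            # "$$" stands for a single literal "$".
            out.append("$")
            pos = dollar + 2
        elif following == "{":
            # Form 2: any expression enclosed in curly braces.
            end = _group_end(text, dollar + 1)
            out.append(str(eval(text[dollar + 2:end - 1], env)))
            pos = end
        else:
            # Form 1: an identifier followed by any number of trailers.
            match = _NAME.match(text, dollar + 1)
            if match is None:
                out.append("$")  # assumption: a lone "$" is kept as-is
                pos = dollar + 1
                continue
            end = match.end()
            while end < len(text):
                if text[end] == ".":
                    attr = _NAME.match(text, end + 1)
                    if attr is None:
                        break  # a dot not followed by a name is plain text
                    end = attr.end()
                elif text[end] in "[(":
                    end = _group_end(text, end)
                else:
                    break
            out.append(str(eval(text[dollar + 1:end], env)))
            pos = end
    return "".join(out)
```

For instance, with a and b bound to 5 and 6, interpolate('$$$a.$b', {'a': a, 'b': b}) follows the same steps as the '$$$a.$b' example in the Examples section: "$$" collapses to "$", "$a" stops at the dot because no identifier follows it, and "$b" is expanded separately.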