[Python-checkins] peps: PEP 501: string prefix redux, now with template objects

nick.coghlan python-checkins at python.org
Sun Aug 23 04:39:05 CEST 2015


https://hg.python.org/peps/rev/709fd2bc3720
changeset:   5979:709fd2bc3720
user:        Nick Coghlan <ncoghlan at gmail.com>
date:        Sun Aug 23 12:38:55 2015 +1000
summary:
  PEP 501: string prefix redux, now with template objects

files:
  pep-0501.txt |  264 +++++++++++++++++++-------------------
  1 files changed, 135 insertions(+), 129 deletions(-)


diff --git a/pep-0501.txt b/pep-0501.txt
--- a/pep-0501.txt
+++ b/pep-0501.txt
@@ -29,29 +29,39 @@
 has not been properly escaped before being passed to the ``os.system`` call.
 
 To address that problem (and a number of other concerns), this PEP proposes an
-alternative approach to compiler supported interpolation, based on a new ``$``
-binary operator with a syntactically constrained right hand side, a new
-``__interpolate__`` magic method, and a substitution syntax inspired by
-that used in ``string.Template`` and ES6 JavaScript, rather than adding a 4th
-substitution variable syntax to Python.
+alternative approach to compiler supported interpolation, using ``i`` (for
+"interpolation") as the new string prefix and a substitution syntax
+inspired by that used in ``string.Template`` and ES6 JavaScript, rather than
+adding a 4th substitution variable syntax to Python.
 
-Some examples of the proposed syntax::
+Some possible examples of the proposed syntax::
 
-    msg = str$'My age next year is ${age+1}, my anniversary is ${anniversary:%A, %B %d, %Y}.'
-    print(_$"This is a $translated $message")
-    translated = l20n$"{{ $user }} is running {{ appname }}"
-    myquery = sql$"SELECT $column FROM $table;"
-    mycommand = sh$"cat $filename"
-    mypage = html$"<html><body>${response.body}</body></html>"
-    callable = defer$ "$x + $y"
+    msg = str(i'My age next year is ${age+1}, my anniversary is ${anniversary:%A, %B %d, %Y}.')
+    print(_(i"This is a $translated $message"))
+    translated = l20n(i"{{ $user }} is running {{ appname }}")
+    myquery = sql(i"SELECT $column FROM $table;")
+    mycommand = sh(i"cat $filename")
+    mypage = html(i"<html><body>${response.body}</body></html>")
+    callable = defer(i"$x + $y")
+
+Summary of differences from PEP 498
+===================================
+
+The key differences of this proposal relative to PEP 498:
+
+* "i" (interpolation template) prefix rather than "f" (formatted string)
+* string.Template/JavaScript inspired substitution syntax, rather than str.format/C# inspired
+* interpolation templates are created at runtime as a new kind of object
+* the default rendering is invoked by calling ``str()`` on a template object
+  rather than automatically
 
 Proposal
 ========
 
-This PEP proposes the introduction of a new binary operator specifically for
-interpolation of arbitrary expressions::
+This PEP proposes the introduction of a new string prefix that declares the
+string to be an interpolation template rather than an ordinary string::
 
-    value = interpolator $ "Substitute $names and ${expressions} at runtime"
+    template = i"Substitute $names and ${expressions} at runtime"
 
 This would be effectively interpreted as::
 
@@ -62,28 +72,25 @@
         (" at runtime", None, None, None, None),
     )
     _field_values = (names, expressions)
-    value = interpolator.__interpolate__(_raw_template,
-                                         _parsed_fields,
-                                         _field_values)
+    template = types.InterpolationTemplate(_raw_template,
+                                           _parsed_fields,
+                                           _field_values)
 
-The right hand side of the new operator would be syntactically constrained to
-be a string literal.
-
-The ``str`` builtin type would gain an ``__interpolate__`` implementation that
-supported the following ``str.format`` inspired semantics::
+The ``__str__`` method on ``types.InterpolationTemplate`` would then implement
+the following ``str.format`` inspired semantics::
 
   >>> import datetime
   >>> name = 'Jane'
   >>> age = 50
   >>> anniversary = datetime.date(1991, 10, 12)
-  >>> str$'My name is $name, my age next year is ${age+1}, my anniversary is ${anniversary:%A, %B %d, %Y}.'
+  >>> str(i'My name is $name, my age next year is ${age+1}, my anniversary is ${anniversary:%A, %B %d, %Y}.')
   'My name is Jane, my age next year is 51, my anniversary is Saturday, October 12, 1991.'
-  >>> str$'She said her name is ${name!r}.'
+  >>> str(i'She said her name is ${name!r}.')
   "She said her name is 'Jane'."
 
-The interpolation operator could be used with single-quoted, double-quoted and
-triple quoted strings, including raw strings. It would not support bytes
-literals as the right hand side of the expression.
+The interpolation template prefix can be combined with single-quoted,
+double-quoted and triple quoted strings, including raw strings. It does not
+support combination with bytes literals.
 
 This PEP does not propose to remove or deprecate any of the existing
 string formatting mechanisms, as those will remain valuable when formatting
@@ -102,9 +109,11 @@
 expressions into Python, when we already have 3 (``str.format``,
 ``bytes.__mod__`` and ``string.Template``)
 
-This PEP proposes to handle the former issue by always specifying an explicit
-interpolator for interpolation operations, and the latter by adopting the
-``string.Template`` substitution syntax defined in PEP 292.
+This PEP proposes to handle the former issue by deferring the actual rendering
+of the interpolation template to its ``__str__`` method (allowing the use of
+other template renderers by passing the template around as an object), and the
+latter by adopting the ``string.Template`` substitution syntax defined in PEP
+292.
 
 The substitution syntax devised for PEP 292 is deliberately simple so that the
 template strings can be extracted into an i18n message catalog, and passed to
@@ -133,18 +142,13 @@
 Specification
 =============
 
-This PEP proposes the introduction of ``$`` as a new binary operator designed
-specifically to support interpolation of template strings::
+This PEP proposes the introduction of ``i`` as a new string prefix that
+results in the creation of an instance of a new type,
+``types.InterpolationTemplate``.
 
-    INTERPOLATOR $ TEMPLATE_STRING
-
-This would work as a normal binary operator (precedence TBD), with the
-exception that the template string would be syntactically constrained to be a
-string literal, rather than permitting arbitrary expressions.
-
-The template string must be a Unicode string (bytes literals are not permitted),
-and string literal concatenation operates as normal within the template string
-component of the expression.
+Interpolation template literals are Unicode strings (bytes literals are not
+permitted), and string literal concatenation operates as normal, with the
+entire combined literal forming the interpolation template.
 
 The template string is parsed into literals and expressions. Expressions
 appear as either identifiers prefixed with a single "$" character, or
@@ -155,15 +159,37 @@
 and is considered part of the literal text, rather than as introducing an
 expression.
 
-These components are then organised into a tuple of tuples, and passed to the
-``__interpolate__`` method of the interpolator identified by the given
-name along with the runtime values of any expressions to be interpolated::
+These components are then organised into an instance of a new type with the
+following semantics::
 
-    DOTTED_NAME.__interpolate__(TEMPLATE_STRING,
-                                <parsed_fields>,
-                                <field_values>)
+    class InterpolationTemplate:
+        __slots__ = ("raw_template", "parsed_fields", "field_values")
 
-The template string field tuple is inspired by the interface of
+        def __new__(cls, raw_template, parsed_fields, field_values):
+            self = super().__new__(cls)
+            self.raw_template = raw_template
+            self.parsed_fields = parsed_fields
+            self.field_values = field_values
+            return self
+
+        def __iter__(self):
+            # Support iterable unpacking
+            yield self.raw_template
+            yield self.parsed_fields
+            yield self.field_values
+
+        def __repr__(self):
+            return str(i"<${type(self).__qualname__} ${self.raw_template!r} "
+                        "at ${id(self):#x}>")
+
+        def __str__(self):
+            ...  # See definition of the default template rendering below
+
+The result of the interpolation template expression is an instance of this
+type, rather than an already rendered string - default rendering only takes
+place when the instance's ``__str__`` method is called.
+
+The format of the parsed fields tuple is inspired by the interface of
 ``string.Formatter.parse``, and consists of a series of 5-tuples each
 containing:
 
@@ -191,7 +217,7 @@
 expression markers. The conversion specifier and format specifier are separated
 from the substition expression by ``!`` and ``:`` as defined for ``str.format``.
 
-If a given substition field has no leading literal section, coversion specifier
+If a given substitution field has no leading literal section, conversion specifier
 or format specifier, then the corresponding elements in the tuple are the
 empty string. If the final part of the string has no trailing substitution
 field, then the field position, field expression, conversion specifier and
@@ -222,13 +248,14 @@
 The parsed fields tuple can be constant folded at compile time, while the
 expression values tuple will always need to be constructed at runtime.
 
-The ``str.__interpolate__`` implementation would have the following
+The ``InterpolationTemplate.__str__`` implementation would have the following
 semantics, with field processing being defined in terms of the ``format``
 builtin and ``str.format`` conversion specifiers::
 
     _converter = string.Formatter().convert_field
 
-    def __interpolate__(raw_template, fields, values):
+    def __str__(self):
+        raw_template, fields, values = self
         template_parts = []
         for leading_text, field_num, expr, conversion, format_spec in fields:
             template_parts.append(leading_text)
@@ -243,18 +270,10 @@
 Writing custom interpolators
 ----------------------------
 
-To simplify the process of writing custom interpolators, it is proposed to add
-a new builtin decorator, ``interpolator``, which would be defined as::
-
-    def interpolator(f):
-        f.__interpolate__ = f.__call__
-        return f
-
-This allows new interpolators to be written as::
-
-    @interpolator
-    def my_custom_interpolator(raw_template, parsed_fields, field_values):
-        ...
+Writing a custom interpolator doesn't require any special syntax. Instead,
+custom interpolators are ordinary callables that process an interpolation
+template directly based on the ``raw_template``, ``parsed_fields`` and
+``field_values`` attributes, rather than relying on the default renderer.
 
 
 Expression evaluation
@@ -287,12 +306,12 @@
 Handling code injection attacks
 -------------------------------
 
-The proposed interpolation expressions make it potentially attractive to write
+The proposed interpolation syntax makes it potentially attractive to write
 code like the following::
 
-    myquery = str$"SELECT $column FROM $table;"
-    mycommand = str$"cat $filename"
-    mypage = str$"<html><body>${response.body}</body></html>"
+    myquery = str(i"SELECT $column FROM $table;")
+    mycommand = str(i"cat $filename")
+    mypage = str(i"<html><body>${response.body}</body></html>")
 
 These all represent potential vectors for code injection attacks, if any of the
 variables being interpolated happen to come from an untrusted source. The
@@ -300,15 +319,16 @@
 use case specific interpolators that take care of quoting interpolated values
 appropriately for the relevant security context::
 
-    myquery = sql$"SELECT $column FROM $table;"
-    mycommand = sh$"cat $filename"
-    mypage = html$"<html><body>${response.body}</body></html>"
+    myquery = sql(i"SELECT $column FROM $table;")
+    mycommand = sh(i"cat $filename")
+    mypage = html(i"<html><body>${response.body}</body></html>")
 
 This PEP does not cover adding such interpolators to the standard library,
 but instead ensures they can be readily provided by third party libraries.
 
-(Although it's tempting to propose adding __interpolate__ implementations to
-``subprocess.call``, ``subprocess.check_call`` and ``subprocess.check_output``)
+(Although it's tempting to propose adding InterpolationTemplate support at
+least to ``subprocess.call``, ``subprocess.check_call`` and
+``subprocess.check_output``)
 
 Format and conversion specifiers
 --------------------------------
@@ -328,20 +348,21 @@
 
 Unmatched braces::
 
-  >>> str$'x=${x'
+  >>> i'x=${x'
     File "<stdin>", line 1
   SyntaxError: missing '}' in interpolation expression
 
 Invalid expressions::
 
-  >>> str$'x=${!x}'
+  >>> i'x=${!x}'
     File "<fstring>", line 1
       !x
       ^
   SyntaxError: invalid syntax
 
-Run time errors occur when evaluating the expressions inside an
-template string. See PEP 498 for some examples.
+Run time errors occur when evaluating the expressions inside a
+template string before creating the interpolation template object. See PEP 498
+for some examples.
 
 Different interpolators may also impose additional runtime
 constraints on acceptable interpolated expressions and other formatting
@@ -359,9 +380,10 @@
 performs internationalisation. For example, the following implementation
 would delegate interpolation calls to ``string.Template``::
 
-    @interpolator
-    def i18n(template, fields, values):
-        translated = gettext.gettext(template)
+    def i18n(template):
+        # A real implementation would also handle normal strings
+        raw_template, fields, values = template
+        translated = gettext.gettext(raw_template)
         value_map = _build_interpolation_map(fields, values)
         return string.Template(translated).safe_substitute(value_map)
 
@@ -376,7 +398,7 @@
 And would could then be invoked as::
 
     # _ = i18n at top of module or injected into the builtins module
-    print(_$"This is a $translated $message")
+    print(_(i"This is a $translated $message"))
 
 Any actual i18n implementation would need to address other issues (most notably
 message catalog extraction), but this gives the general idea of what might be
@@ -389,14 +411,14 @@
 
 With the syntax in this PEP, an l20n interpolator could be written as::
 
-    translated = l20n$"{{ $user }} is running {{ appname }}"
+    translated = l20n(i"{{ $user }} is running {{ appname }}")
 
 With the syntax proposed in PEP 498 (and neglecting the difficulty of doing
 catalog lookups using PEP 498's semantics), the necessary brace escaping would
 make the string look like this in order to interpolate the user variable
 while preserving all of the expected braces::
 
-    interpolated = "{{{{ ${user} }}}} is running {{{{ appname }}}}"
+    locally_interpolated = f"{{{{ ${user} }}}} is running {{{{ appname }}}}"
 
 
 Possible integration with the logging module
@@ -408,13 +430,17 @@
 logging messages also poses a problem for extensive logging of runtime events
 for monitoring purposes.
 
-While beyond the scope of this initial PEP, the proposal described here could
-potentially be applied to the logging module's event reporting APIs, permitting
-relevant details to be captured using forms like::
+While beyond the scope of this initial PEP, interpolation template support
+could potentially be added to the logging module's event reporting APIs,
+permitting relevant details to be captured using forms like::
 
-    logging.debug$"Event: $event; Details: $data"
-    logging.critical$"Error: $error; Details: $data"
+    logging.debug(i"Event: $event; Details: $data")
+    logging.critical(i"Error: $error; Details: $data")
 
+As the interpolation template is passed in as an ordinary argument, other
+keyword arguments also remain available::
+
+    logging.critical(i"Error: $error; Details: $data", exc_info=True)
 
 Discussion
 ==========
@@ -422,45 +448,14 @@
 Refer to PEP 498 for additional discussion, as several of the points there
 also apply to this PEP.
 
-Using call syntax to support keyword-only parameters
-----------------------------------------------------
-
-The logging examples raise the question of whether or not it may be desirable
-to allow interpolators to accept arbitrary keyword arguments, and allow folks
-to write things like::
-
-    logging.critical$"Error: $error; Details: $data"(exc_info=True)
-
-in order to pass additional keyword only arguments to the interpolator.
-
-With the current PEP, such code would attempt to call the result of the
-interpolation operation. If interpolation keyword support was added, then
-calling the result of an interpolation operation directly would require
-parentheses for disambiguation::
-
-    (defer$ "$x + $y")()
-
-("defer" here would be an interpolator that compiled the supplied string as
-a piece of Python code with eagerly bound references to the containing
-namespace)
-
-Determining relative precedence
--------------------------------
-
-The PEP doesn't currently specify the relative precedence of the new operator,
-as the only examples considered so far concern standalone expressions or simple
-variable assignments.
-
-Development of a reference implementation based on the PEP 498 reference
-implementation may help answer that question.
-
 Deferring support for binary interpolation
 ------------------------------------------
 
 Supporting binary interpolation with this syntax would be relatively
-straightforward (just a matter of relaxing the syntactic restrictions on the
-right hand side of the operator), but poses a signficant likelihood of
-producing confusing type errors when a text interpolator was presented with
+straightforward (the elements in the parsed fields tuple would just be
+byte strings rather than text strings, and the default renderer would be
+markedly less useful), but poses a significant likelihood of producing
+confusing type errors when a text interpolator was presented with
 binary input.
 
 Since the proposed operator is useful without binary interpolation support, and
@@ -474,13 +469,13 @@
 to interpolators. This greatly complicated the i18n example, as it needed to
 reconstruct the original template to pass to the message catalog lookup.
 
-Using a magic method rather than a global name lookup
------------------------------------------------------
+Creating a rich object rather than a global name lookup
+-------------------------------------------------------
 
 Earlier versions of this PEP used an ``__interpolate__`` builtin, rather than
-a magic method on an explicitly named interpolator. Naming the interpolator
-eliminated a lot of the complexity otherwise associated with shadowing the
-builtin function in order to modify the semantics of interpolation.
+creating a new kind of object for later consumption by interpolation
+functions. Creating a rich descriptive object with a useful default renderer
+made it much easier to support customisation of the semantics of interpolation.
 
 Relative order of conversion and format specifier in parsed fields
 ------------------------------------------------------------------
@@ -499,6 +494,17 @@
 possible to write interpolators without caring about the precise field order
 at all.
 
+
+Acknowledgements
+================
+
+* Eric V. Smith for creating PEP 498 and demonstrating the feasibility of
+  arbitrary expression substitution in string interpolation
+* Barry Warsaw for the string.Template syntax defined in PEP 292
+* Armin Ronacher for pointing me towards Mozilla's l20n project
+* Mike Miller for his survey of programming language interpolation syntaxes in
+  PEP (TBD)
+
 References
 ==========
 

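The template object and custom interpolators described in the diff above can
be approximated in current Python. This is only a sketch under stated
assumptions: the ``i"..."`` prefix does not exist, so the parsed fields and
field values that the compiler would generate are written out by hand, and
``__init__`` is used in place of the ``__new__`` shown in the PEP text. The
``sh`` function is a hypothetical interpolator based on ``shlex.quote``; it is
not something the PEP specifies.

```python
import shlex
import string

_converter = string.Formatter().convert_field


class InterpolationTemplate:
    """Hand-built stand-in for the object an i"..." literal would create."""

    __slots__ = ("raw_template", "parsed_fields", "field_values")

    def __init__(self, raw_template, parsed_fields, field_values):
        self.raw_template = raw_template
        self.parsed_fields = parsed_fields
        self.field_values = field_values

    def __iter__(self):
        # Support iterable unpacking: raw, fields, values = template
        yield self.raw_template
        yield self.parsed_fields
        yield self.field_values

    def __str__(self):
        # Default rendering: str.format-style conversion and formatting
        raw_template, fields, values = self
        parts = []
        for leading_text, field_num, expr, conversion, format_spec in fields:
            parts.append(leading_text)
            if field_num is not None:
                value = values[field_num]
                if conversion:
                    value = _converter(value, conversion)
                parts.append(format(value, format_spec))
        return "".join(parts)


def sh(template):
    """Hypothetical interpolator: shell-quote every substituted value."""
    raw_template, fields, values = template
    parts = []
    for leading_text, field_num, expr, conversion, format_spec in fields:
        parts.append(leading_text)
        if field_num is not None:
            parts.append(shlex.quote(str(values[field_num])))
    return "".join(parts)


# Assumed expansion of: i'She said her name is ${name!r}, aged ${age+1}.'
name, age = "Jane", 50
greeting = InterpolationTemplate(
    "She said her name is ${name!r}, aged ${age+1}.",
    (
        ("She said her name is ", 0, "name", "r", ""),
        (", aged ", 1, "age+1", "", ""),
        (".", None, None, None, None),
    ),
    (name, age + 1),
)

# Assumed expansion of: i"cat $filename"
filename = "notes.txt; rm -rf ~"
command = InterpolationTemplate(
    "cat $filename",
    (("cat ", 0, "filename", "", ""),),
    (filename,),
)

print(str(greeting))  # default rendering via __str__
print(sh(command))    # same kind of object, rendered by a custom interpolator
```

Because the template is an ordinary object, the default rendering and the
custom one differ only in which callable receives it; no extra syntax is
involved.
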
-- 
Repository URL: https://hg.python.org/peps

