[Python-checkins] peps: Note how f-strings are tokenized and decoded before scanning for expressions.

eric.smith python-checkins at python.org
Sat Sep 5 02:28:17 CEST 2015


https://hg.python.org/peps/rev/b8edd3309920
changeset:   6031:b8edd3309920
user:        Eric V. Smith <eric at trueblade.com>
date:        Fri Sep 04 20:28:35 2015 -0400
summary:
  Note how f-strings are tokenized and decoded before scanning for expressions.

files:
  pep-0498.txt |  43 ++++++++++++++++++++++++++++++++-------
  1 files changed, 35 insertions(+), 8 deletions(-)


diff --git a/pep-0498.txt b/pep-0498.txt
--- a/pep-0498.txt
+++ b/pep-0498.txt
@@ -174,14 +174,30 @@
 binary f-strings. 'f' may also be combined with 'u', in either order,
 although adding 'u' has no effect.
 
-f-strings are parsed in to literals and expressions. Expressions
-appear within curly braces '{' and '}. The parts of the string outside
-of braces are literals.  The expressions are evaluated, formatted with
-the existing __format__ protocol, then the results are concatenated
-together with the string literals. While scanning the string for
-expressions, any doubled braces '{{' or '}}' are replaced by the
-corresponding single brace. Doubled opening braces do not signify the
-start of an expression.
+f-strings are tokenized using the same rules as normal strings, raw
+strings, binary strings, and triple quoted strings. That is, the
+string must end with the same character that it started with: if it
+starts with a single quote it must end with a single quote, etc.  This
+implies that any code that currently scans Python code looking for
+strings should be trivially modifiable to recognize f-strings (parsing
+within an f-string is another matter, of course).
+
+Once tokenized, f-strings are decoded. This will convert backslash
+escapes such as ``\n``, ``\xhh``, ``\uxxxx``, ``\Uxxxxxxxx``, and
+named unicode characters ``\N{name}`` into their associated Unicode
+characters [#]_.
+
+Up to this point, the processing of f-strings and normal strings is
+exactly the same.
+
+The difference is that f-strings are then parsed into literals and
+expressions. Expressions appear within curly braces '{' and '}'. The
+parts of the string outside of braces are literals.  The expressions
+are evaluated, formatted with the existing __format__ protocol, then
+the results are concatenated together with the string literals. While
+scanning the string for expressions, any doubled braces '{{' or '}}'
+are replaced by the corresponding single brace. Doubled opening braces
+do not signify the start of an expression.
 
 Following the expression, an optional type conversion may be
 specified.  The allowed conversions are ``'!s'``, ``'!r'``, or
@@ -228,6 +244,14 @@
 escape sequences are processed before f-strings are parsed for
 expressions.
 
+Note that the correct way to have a literal brace appear in the
+resulting string value is to double the brace::
+
+  >>> f'{{ {4*10} }}'
+  '{ 40 }'
+  >>> f'{{{4*10}}}'
+  '{40}'
+
 Code equivalence
 ----------------
 
@@ -659,6 +683,9 @@
 .. [#] Format string syntax
        (https://docs.python.org/3/library/string.html#format-string-syntax)
 
+.. [#] String literal description
+       (https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals)
+
 .. [#] ast.parse() documentation
        (https://docs.python.org/3/library/ast.html#ast.parse)
 

-- 
Repository URL: https://hg.python.org/peps

