[Python-checkins] peps: Note how f-strings are tokenized and decoded before scanning for expressions.
eric.smith
python-checkins at python.org
Sat Sep 5 02:28:17 CEST 2015
https://hg.python.org/peps/rev/b8edd3309920
changeset: 6031:b8edd3309920
user: Eric V. Smith <eric at trueblade.com>
date: Fri Sep 04 20:28:35 2015 -0400
summary:
Note how f-strings are tokenized and decoded before scanning for expressions.
files:
pep-0498.txt | 43 ++++++++++++++++++++++++++++++++-------
1 files changed, 35 insertions(+), 8 deletions(-)
diff --git a/pep-0498.txt b/pep-0498.txt
--- a/pep-0498.txt
+++ b/pep-0498.txt
@@ -174,14 +174,30 @@
binary f-strings. 'f' may also be combined with 'u', in either order,
although adding 'u' has no effect.
-f-strings are parsed in to literals and expressions. Expressions
-appear within curly braces '{' and '}. The parts of the string outside
-of braces are literals. The expressions are evaluated, formatted with
-the existing __format__ protocol, then the results are concatenated
-together with the string literals. While scanning the string for
-expressions, any doubled braces '{{' or '}}' are replaced by the
-corresponding single brace. Doubled opening braces do not signify the
-start of an expression.
+f-strings are tokenized using the same rules as normal strings, raw
+strings, binary strings, and triple quoted strings. That is, the
+string must end with the same character that it started with: if it
+starts with a single quote it must end with a single quote, etc. This
+implies that any code that currently scans Python code looking for
+strings should be trivially modifiable to recognize f-strings (parsing
+within an f-string is another matter, of course).
+
+Once tokenized, f-strings are decoded. This will convert backslash
+escapes such as ``\n``, ``\xhh``, ``\uxxxx``, ``\Uxxxxxxxx``, and
+named unicode characters ``\N{name}`` into their associated Unicode
+characters [#]_.
+
+Up to this point, the processing of f-strings and normal strings is
+exactly the same.
+
+The difference is that f-strings are then parsed into literals and
+expressions. Expressions appear within curly braces '{' and '}'. The
+parts of the string outside of braces are literals. The expressions
+are evaluated, formatted with the existing __format__ protocol, then
+the results are concatenated together with the string literals. While
+scanning the string for expressions, any doubled braces '{{' or '}}'
+are replaced by the corresponding single brace. Doubled opening braces
+do not signify the start of an expression.
Following the expression, an optional type conversion may be
specified. The allowed conversions are ``'!s'``, ``'!r'``, or
@@ -228,6 +244,14 @@
escape sequences are processed before f-strings are parsed for
expressions.
+Note that the correct way to have a literal brace appear in the
+resulting string value is to double the brace::
+
+ >>> f'{{ {4*10} }}'
+ '{ 40 }'
+ >>> f'{{{4*10}}}'
+ '{40}'
+
Code equivalence
----------------
@@ -659,6 +683,9 @@
.. [#] Format string syntax
(https://docs.python.org/3/library/string.html#format-string-syntax)
+.. [#] String literal description
+ (https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals)
+
.. [#] ast.parse() documentation
(https://docs.python.org/3/library/ast.html#ast.parse)
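
Illustration (not part of the changeset): the behaviour described in the added text can be checked with a minimal sketch, assuming Python 3.6 or later, where f-strings are available. The variable names are hypothetical and chosen only for this example. It shows that doubled braces yield single literal braces, that a backslash escape in the literal part is decoded as in a normal string, and that a ``!r`` conversion applies ``repr()`` to the expression result.

```python
# Assumes Python 3.6+; all names below are illustrative only.
value = 4 * 10

# Doubled braces become single literal braces; {value} is an expression.
doubled_spaced = f'{{ {value} }}'
doubled_tight = f'{{{value}}}'

# The escape \n in the literal part is decoded just as in a normal string.
escaped = f'line\n{value}'
plain = 'line\n' + format(value)

# The !r conversion calls repr() on the result of the expression.
converted = f'{"x"!r}'

print(doubled_spaced)
print(doubled_tight)
print(escaped == plain)
print(converted)
```

The two brace cases mirror the examples the diff adds to the PEP; the escape and conversion cases are extra checks written for this sketch.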
--
Repository URL: https://hg.python.org/peps