[Python-Dev] Subtle difference between f-strings and str.format()
Serhiy Storchaka
storchaka at gmail.com
Wed Mar 28 11:27:19 EDT 2018
There is a subtle semantic difference between str.format() and the
"equivalent" f-string.

    '{}{}'.format(a, b)
    f'{a}{b}'

In the former case b is evaluated before formatting a. This is equivalent to

    t1 = a
    t2 = b
    t3 = format(t1)
    t4 = format(t2)
    r = t3 + t4

In the latter case a is formatted before evaluating b. This is equivalent to

    t1 = a
    t2 = format(t1)
    t3 = b
    t4 = format(t3)
    r = t2 + t4

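The difference becomes visible once evaluating or formatting an operand has a side effect. The following is a minimal sketch, not part of the proposed patches: the names Traced, make and log are hypothetical helpers introduced only to record the order of events, and the recorded order is what current CPython is expected to produce.

```python
# Hypothetical helpers for illustration only.
log = []

class Traced:
    """Object that records when it is formatted."""
    def __init__(self, name):
        self.name = name

    def __format__(self, spec):
        log.append("format " + self.name)
        return self.name

def make(name):
    """Record when the subexpression is evaluated, then return the object."""
    log.append("eval " + name)
    return Traced(name)

def order_with_str_format():
    log.clear()
    result = '{}{}'.format(make('a'), make('b'))
    return result, list(log)

def order_with_fstring():
    log.clear()
    result = f"{make('a')}{make('b')}"
    return result, list(log)

print(order_with_str_format())
print(order_with_fstring())
```

Both calls build the same string; only the recorded sequence of "eval" and "format" events differs, which is exactly the reordering an optimizer would introduce.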
In most cases this doesn't matter, but when implementing the optimization
that transforms the former expression into the latter one ([1], [2]),
we have to decide what to do about this difference.
1. Keep the exact semantics of str.format() when optimizing it. This means
that it should be transformed into an AST node different from the AST node
used for f-strings: either introduce a new AST node type, or add a
boolean flag to JoinedStr.
2. Change the semantics of f-strings. Make them closer to the semantics of
str.format(): evaluate all subexpressions first, then format them. This
can be implemented in two ways:
2a) Add additional instructions for stack manipulations. This will slow
down f-strings.
2b) Introduce a new complex opcode that will replace FORMAT_VALUE and
BUILD_STRING. This will speed up f-strings.
3. Transform str.format() into an f-string, accepting the changed
semantics, and ignore the difference. This is not new: the optimizer
already changes semantics. Non-optimized "if a and True:" would call
bool(a) twice, but optimized code calls it only once.
[1] https://bugs.python.org/issue28307
[2] https://bugs.python.org/issue28308