[Python-ideas] Briefer string format
Steven D'Aprano
steve at pearwood.info
Thu Jul 23 05:31:14 CEST 2015
Executive summary for those in a hurry:
* implicit concatenation of strings *of any type* always occurs at
compile-time;
* if the first (or any?) of the concat'ed fragments begins with an f
prefix, then the resulting concatenated string is deemed to begin
with an f prefix and is compiled to a call to format (or some
other appropriate implementation), which is a run-time operation;
* the peephole optimizer has to avoid concat'ing mixed f and
non-f strings: f'{spam}' + '{eggs}' should evaluate to something
like (format(spam) + '{eggs}').
Longer version with more detail below.
On Wed, Jul 22, 2015 at 02:52:30PM -0400, Eric V. Smith wrote:
> On 07/20/2015 03:22 PM, Guido van Rossum wrote:
> > Not sure what you mean by "implicit merging" -- if you mean literal
> > concatenation (e.g. 'foo' "bar" == 'foobar') then I think it should be
> > allowed, just like we support mixing quotes and r''.
>
> Do we really want to support this? It complicates the implementation,
> and I'm not sure of the value.
>
> f'{foo}' 'bar' f'{baz}'
> becomes something like:
> format(foo) + 'bar' + format(baz)
>
> You're not merging similar things, like you are with normal string
> concatenation.
I would not want or expect that behaviour. However, I would want and
expect that behaviour with *explicit* concatenation using the +
operator. I would want the peephole optimizer to avoid optimizing this
case:
f'{foo}' + 'bar' + f'{baz}'
and allow it to be compiled to something like:
format(foo) + 'bar' + format(baz)
With explicit concatenation, the format() calls occur before the +
operators are called.
Constant-folding 'a' + 'b' to 'ab' is an optimization; it doesn't change
the semantics of the concat. But constant-folding f'{a}' + '{b}' would
change the semantics of the concatenation, because f strings aren't
constants; they only look like them.
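To make that concrete, here is a rough sketch. It assumes that
str.format() with keyword arguments is a fair stand-in for whatever the
f prefix ends up compiling to; the helper name fmt and the sample values
are made up for illustration.

```python
# Hypothetical stand-in for an f-string: format the template against a
# namespace. The real implementation is undecided; this is an assumption.
def fmt(template, namespace):
    return template.format(**namespace)

namespace = {'a': 1, 'b': 2}

# Explicit concatenation as proposed: format first, then apply +.
unfolded = fmt('{a}', namespace) + '{b}'

# What a naive constant-folder would do: concatenate first, then format.
folded = fmt('{a}' + '{b}', namespace)

print(repr(unfolded))  # '1{b}'
print(repr(folded))    # '12'
```

The two results differ, which is why the folding is not a pure
optimization in the mixed case.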
In the case of *implicit* concatenation, I think that the concatenations
should occur first, at compile time. Yes, that deliberately introduces a
difference between implicit and explicit concatenation. That's a
feature, not a bug!
Implicit concatenation will help in the same cases where it usually
helps, namely long strings without newlines:
msg = (f'a long message here blah blah {x}'
       f' and {y} and {z} and more {stuff} and {things}'
       f' and perhaps even more {poppycock}'
       )
That should be treated as syntactically equivalent to:
msg = f'a long message here blah blah {x} and {y} and {z} and more {stuff} and {things} and perhaps even more {poppycock}'
which is then compiled into the usual format(...) magic, as normal. So,
a very strong +1 on allowing implicit concatenation.
I would go further and allow all the f prefixes apart from the first to
be optional. To put it another way, the first f prefix "infects" all the
other string fragments:
msg = (f'a long message here blah blah {x}'
       ' and {y} and {z} and more {stuff} and {things}'
       ' and perhaps even more {poppycock}'
       )
should be exactly the same as the first version. My reasoning is that
the implicit concatenation always occurs first, so by the time the
format(...) magic occurs at run-time, the interpreter no longer knows
which braces came from an f-string and which came from a regular string.
(Implicit concatenation is a compile-time operation, the format(...)
stuff is run-time, so there is a clear and logical order of operations.)
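The compile-time half of that can already be checked for ordinary
strings. The sketch below uses the ast module (assuming Python 3.8 or
later, where literals parse to ast.Constant) to show that adjacent
literals are merged before any code runs, then applies str.format() to
the merged template as an assumed stand-in for the run-time format(...)
step.

```python
import ast

# Two adjacent plain literals; no f prefix involved.
source = "'blah {x}' ' and {y}'"

# The parser has already merged the fragments into a single constant.
node = ast.parse(source, mode='eval').body
merged = node.value
print(repr(merged))  # 'blah {x} and {y}'

# Assumed stand-in for the run-time step: format the merged template.
# By now nothing records which fragment each pair of braces came from.
result = merged.format(x=1, y=2)
print(result)  # blah 1 and 2
```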
To avoid potential surprises, I would disallow the case where an f
prefix occurs in a later fragment but not in the first, or at least
raise a compile-time warning:
'spam' 'eggs' f'{cheese}'
should raise or warn. (That restriction could be removed in the future,
if it turns out not to be a problem.)
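As a toy model of the rule, not a proposal for how the compiler should
implement it: the function merge_fragments and its (prefix, body) pair
representation are hypothetical, and prefixes other than f are ignored
for brevity.

```python
# Hypothetical model of the proposed rule. Each fragment is a pair
# (prefix, body), where prefix is 'f' or ''.
def merge_fragments(fragments):
    prefixes = [prefix for prefix, _ in fragments]
    body = ''.join(text for _, text in fragments)
    if prefixes[0] == 'f':
        # The first f prefix "infects" every later fragment.
        return ('f', body)
    if 'f' in prefixes:
        # An f prefix that first appears later is rejected (or warned about).
        raise SyntaxError('f prefix must appear on the first fragment')
    return ('', body)

print(merge_fragments([('f', '{x}'), ('', ' and {y}')]))  # ('f', '{x} and {y}')
print(merge_fragments([('', 'spam'), ('', 'eggs')]))      # ('', 'spameggs')

try:
    merge_fragments([('', 'spam'), ('', 'eggs'), ('f', '{cheese}')])
except SyntaxError as exc:
    print('rejected:', exc)
```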
> And merging f-strings:
> f'{foo}' f'{bar}'
> similarly just becomes concatenating the results of some function calls.
That's safe to do at compile-time:
f'{foo}' f'{bar}'
f'{foo}{bar}'
will always be the same. There's no need to delay the concat until after
the formats.
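A quick check of that equivalence, again assuming str.format() on a
namespace approximates what the f prefix will do (the sample values are
arbitrary):

```python
# Assumed stand-in for the f prefix: str.format() against a namespace.
namespace = {'foo': 'spam', 'bar': 42}

# Format each fragment separately, then concatenate the results.
formatted_then_joined = '{foo}'.format(**namespace) + '{bar}'.format(**namespace)

# Concatenate the templates first, then format once.
joined_then_formatted = '{foo}{bar}'.format(**namespace)

print(formatted_then_joined == joined_then_formatted)  # True
```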
--
Steve