
Executive summary for those in a hurry:

* implicit concatenation of strings *of any type* always occurs at compile-time;

* if the first (or any?) of the concat'ed fragments begins with an f prefix, then the resulting concatenated string is deemed to begin with an f prefix and is compiled to a call to format (or some other appropriate implementation), which is a run-time operation;

* the peep-hole optimizer has to avoid concat'ing mixed f and non-f strings: f'{spam}' + '{eggs}' should evaluate to something like (format(spam) + '{eggs}').

Longer version with more detail below.

On Wed, Jul 22, 2015 at 02:52:30PM -0400, Eric V. Smith wrote:
On 07/20/2015 03:22 PM, Guido van Rossum wrote:
Not sure what you mean by "implicit merging" -- if you mean literal concatenation (e.g. 'foo' "bar" == 'foobar') then I think it should be allowed, just like we support mixing quotes and r''.
Do we really want to support this? It complicates the implementation, and I'm not sure of the value.
    f'{foo}' 'bar' f'{baz}'

becomes something like:

    format(foo) + 'bar' + format(baz)

You're not merging similar things, like you are with normal string concatenation.
I would not want or expect that behaviour.

However, I would want and expect that behaviour with *explicit* concatenation using the + operator. I would want the peephole optimizer to avoid optimizing this case:

    f'{foo}' + 'bar' + f'{baz}'

and allow it to be compiled to something like:

    format(foo) + 'bar' + format(baz)

With explicit concatenation, the format() calls occur before the + operators are called. Constant-folding 'a' + 'b' to 'ab' is an optimization; it doesn't change the semantics of the concat. But constant-folding f'{a}' + '{b}' would change the semantics of the concatenation, because f strings aren't constants, they only look like them.

In the case of *implicit* concatenation, I think that the concatenations should occur first, at compile time. Yes, that deliberately introduces a difference between implicit and explicit concatenation; that's a feature, not a bug!

Implicit concatenation will help in the same cases that implicit concatenation usually helps: long strings without newlines:

    msg = (f'a long message here blah blah {x}'
           f' and {y} and {z} and more {stuff} and {things}'
           f' and perhaps even more {poppycock}'
           )

That should be treated as syntactically equivalent to:

    msg = f'a long message here blah blah {x} and {y} and {z} and more {stuff} and {things} and perhaps even more {poppycock}'

which is then compiled into the usual format(...) magic, as normal.

So, a very strong +1 on allowing implicit concatenation.

I would go further and allow all the f prefixes apart from the first to be optional. To put it another way, the first f prefix "infects" all the other string fragments:

    msg = (f'a long message here blah blah {x}'
           ' and {y} and {z} and more {stuff} and {things}'
           ' and perhaps even more {poppycock}'
           )

should be exactly the same as the first version. My reasoning is that the implicit concatenation always occurs first, so by the time the format(...) magic occurs at run-time, the interpreter no longer knows which braces came from an f-string and which came from a regular string. (Implicit concatenation is a compile-time operation, the format(...) stuff is run-time, so there is a clear and logical order of operations.)

To avoid potential surprises, I would disallow the case where the f prefix doesn't occur in the first fragment, or at least raise a compile-time warning:

    'spam' 'eggs' f'{cheese}'

should raise or warn. (That restriction could be removed in the future, if it turns out not to be a problem.)

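A rough sketch of the rule argued for above, written as ordinary Python so the order of operations is visible. It is an illustration only, not a claim about how any compiler does or should implement it. The helper name merge_fragments and the (is_f, text) pair representation are hypothetical, and str.format is used as a stand-in for the format(...) magic, so only plain names inside the braces are handled, not arbitrary expressions.

```python
def merge_fragments(fragments, namespace):
    """Simulate the proposed rule for implicitly concatenated fragments.

    fragments: list of (is_f, text) pairs in source order (hypothetical
               representation of what the compiler would see).
    namespace: dict of names available when the string is evaluated.
    """
    if not fragments:
        raise ValueError("need at least one fragment")

    flags = [is_f for is_f, _ in fragments]

    # "Compile-time" step: implicit concatenation always happens first.
    joined = "".join(text for _, text in fragments)

    # No f prefix anywhere: an ordinary literal, braces stay literal.
    if not any(flags):
        return joined

    # An f prefix that is not on the first fragment is rejected.
    if not flags[0]:
        raise SyntaxError("f prefix must appear on the first fragment")

    # "Run-time" step: the first f prefix has infected the whole string.
    # str.format is an assumption standing in for the real formatting.
    return joined.format(**namespace)


x, y = 1, 2
msg = merge_fragments(
    [(True, "a long message {x}"), (False, " and {y}")],
    {"x": x, "y": y},
)
print(msg)
```

Under this sketch the second fragment is formatted even though it carries no f prefix of its own, while a list such as [(False, 'spam'), (False, 'eggs'), (True, '{cheese}')] raises instead of formatting.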
And merging f-strings:

    f'{foo}' f'{bar}'

similarly just becomes concatenating the results of some function calls.
That's safe to do at compile-time:

    f'{foo}' f'{bar}'
    f'{foo}{bar}'

will always be the same. There's no need to delay the concat until after the formats.

-- 
Steve