[Python-ideas] Briefer string format

Steven D'Aprano steve at pearwood.info
Thu Jul 23 05:31:14 CEST 2015


Executive summary for those in a hurry: 

* implicit concatenation of strings *of any type* always occurs at 
  compile-time;

* if the first (or any?) of the concat'ed fragments begins with an f 
  prefix, then the resulting concatenated string is deemed to begin 
  with an f prefix and is compiled to a call to format (or some 
  other appropriate implementation), which is a run-time operation;

* the peep-hole optimizer has to avoid concat'ing mixed f and 
  non-f strings: f'{spam}' + '{eggs}' should evaluate to something
  like (format(spam) + '{eggs}').


Longer version with more detail below.


On Wed, Jul 22, 2015 at 02:52:30PM -0400, Eric V. Smith wrote:
> On 07/20/2015 03:22 PM, Guido van Rossum wrote:
> > Not sure what you mean by "implicit merging" -- if you mean literal
> > concatenation (e.g. 'foo' "bar" == 'foobar') then I think it should be
> > allowed, just like we support mixing quotes and r''.
> 
> Do we really want to support this? It complicates the implementation,
> and I'm not sure of the value.
> 
> f'{foo}' 'bar' f'{baz}'
> becomes something like:
> format(foo) + 'bar' + format(baz)
> 
> You're not merging similar things, like you are with normal string
> concatenation.

I would not want or expect that behaviour. However, I would want and 
expect that behaviour with *explicit* concatenation using the + 
operator. I would want the peephole optimizer to avoid optimizing this 
case:

    f'{foo}' + 'bar' + f'{baz}'

and allow it to be compiled to something like:

    format(foo) + 'bar' + format(baz)


With explicit concatenation, the format() calls occur before the + 
operators are called.

Constant-folding 'a' + 'b' to 'ab' is an optimization; it doesn't change 
the semantics of the concat. But constant-folding f'{a}' + '{b}' would 
change the semantics of the concatenation, because f strings aren't 
constants; they only look like them.
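
To make the difference concrete, here is a minimal sketch that emulates 
the proposed f prefix with str.format. The helper f() and the values of 
spam and eggs are assumptions made up for illustration; nothing here is 
the proposed implementation.

```python
# Hypothetical stand-in for the proposed f'...' literal: the template
# is formatted at run-time with the names passed in.
def f(template, **names):
    return template.format(**names)

spam, eggs = "S", "E"

# Explicit concatenation as argued above: format first, then concatenate.
unfolded = f("{spam}", spam=spam) + "{eggs}"

# What an over-eager constant-folder would effectively do: merge the
# two literals first, then format the merged template.
folded = f("{spam}" + "{eggs}", spam=spam, eggs=eggs)

print(unfolded)  # S{eggs}
print(folded)    # SE
```

The two results differ, which is why folding here would not be a pure 
optimization.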

In the case of *implicit* concatenation, I think that the concatenations 
should occur first, at compile time. Yes, that deliberately introduces a 
difference between implicit and explicit concatenation. That's a 
feature, not a bug!

Implicit concatenation will help in the same cases that implicit 
concatenation usually helps: long strings without newlines:

    msg = (f'a long message here blah blah {x}'
           f' and {y} and {z} and more {stuff} and {things}'
           f' and perhaps even more {poppycock}'
           )

That should be treated as syntactically equivalent to:

    msg = f'a long message here blah blah {x} and {y} and {z} and more {stuff} and {things} and perhaps even more {poppycock}'

which is then compiled into the usual format(...) magic, as normal. So, 
a very strong +1 on allowing implicit concatenation.
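
As a rough check of the equivalence claimed above, the sketch below uses 
ordinary string literals and str.format as a stand-in for the proposed 
run-time formatting. The values bound to x, y, z and the other names are 
invented for the example.

```python
# Adjacent literals are joined by the compiler into a single template.
fragments = ('a long message here blah blah {x}'
             ' and {y} and {z} and more {stuff} and {things}'
             ' and perhaps even more {poppycock}'
             )
single = 'a long message here blah blah {x} and {y} and {z} and more {stuff} and {things} and perhaps even more {poppycock}'

# Hypothetical values, for illustration only.
values = dict(x=1, y=2, z=3, stuff='s', things='t', poppycock='p')

# The templates are the same string, so formatting either one
# necessarily gives the same message.
assert fragments == single
assert fragments.format(**values) == single.format(**values)
```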

I would go further and allow all the f prefixes apart from the first to 
be optional. To put it another way, the first f prefix "infects" all the 
other string fragments:

    msg = (f'a long message here blah blah {x}'
            ' and {y} and {z} and more {stuff} and {things}'
            ' and perhaps even more {poppycock}'
           )

should be exactly the same as the first version. My reasoning is that 
the implicit concatenation always occurs first, so by the time the 
format(...) magic occurs at run-time, the interpreter no longer knows 
which braces came from an f-string and which came from a regular string.

(Implicit concatenation is a compile-time operation and the format(...) 
stuff is run-time, so there is a clear and logical order of operations.)
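
The compile-time half of that ordering can be observed for ordinary 
string literals today. This sketch compiles a small snippet and looks at 
the constants stored in the resulting code object; the snippet and the 
"<example>" filename are arbitrary choices for the demonstration.

```python
# Compile a statement that uses implicit concatenation of plain strings.
code = compile("msg = 'spam ' 'and ' 'eggs'", "<example>", "exec")

# Only the already-merged string is stored as a constant; the separate
# fragments no longer exist by the time the code runs.
assert 'spam and eggs' in code.co_consts
assert 'spam ' not in code.co_consts
```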

To avoid potential surprises, I would disallow the case where the f 
prefix doesn't occur in the first fragment, or at least raise a 
compile-time warning:

    'spam' 'eggs' f'{cheese}'

should raise or warn. (That restriction could be removed in the future, 
if it turns out not to be a problem.)
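
The rules suggested so far can be sketched as a toy model. Everything 
here is hypothetical: merge_fragments() and evaluate() are names invented 
for this illustration, fragments are modelled as (prefix, text) pairs, 
and str.format stands in for whatever the real run-time formatting would 
be. The sketch assumes a non-empty list of fragments.

```python
def merge_fragments(fragments):
    """'Compile-time' step: join (prefix, text) pairs into one template.

    prefix is 'f' or ''. Returns (is_f, merged_text).
    """
    prefixes = [prefix for prefix, _ in fragments]
    text = "".join(part for _, part in fragments)
    # Proposed restriction: an f prefix may not first appear after a
    # non-f fragment.
    if "f" in prefixes and prefixes[0] != "f":
        raise SyntaxError("f prefix must appear on the first fragment")
    # A leading f prefix "infects" every following fragment.
    return prefixes[0] == "f", text


def evaluate(fragments, namespace):
    """'Run-time' step: format the merged template if it is an f-string."""
    is_f, text = merge_fragments(fragments)
    return text.format(**namespace) if is_f else text


# The first f prefix applies to the later, unprefixed fragment too.
print(evaluate([("f", "{x}"), ("", " and {y}")], {"x": 1, "y": 2}))  # 1 and 2

# No f prefix anywhere: the braces are left alone.
print(evaluate([("", "spam"), ("", "{eggs}")], {}))  # spam{eggs}
```

In this model, [('', 'spam'), ('', 'eggs'), ('f', '{cheese}')] raises 
SyntaxError, matching the "raise" option; a warning would be the milder 
alternative.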

 
> And merging f-strings:
> f'{foo}' f'{bar}'
> similarly just becomes concatenating the results of some function calls.

That's safe to do at compile-time:

  f'{foo}' f'{bar}'
  f'{foo}{bar}'

will always be the same. There's no need to delay the concat until after 
the formats.
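
A small check of that claim, again with str.format standing in for the 
proposed formatting and with made-up values for foo and bar:

```python
foo, bar = 'F', 'B'

# Format each fragment, then concatenate the results.
separately = '{foo}'.format(foo=foo) + '{bar}'.format(bar=bar)

# Concatenate the fragments first, then format once.
merged = '{foo}{bar}'.format(foo=foo, bar=bar)

assert separately == merged == 'FB'
```

Because every fragment is an f fragment, the order of the two steps does 
not affect the result.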



-- 
Steve

