[Python-ideas] Let’s make escaping in f-literals impossible

Thu Aug 18 12:50:11 EDT 2016

I'm generally inclined to agree, especially as someone who is very 
likely to be implementing syntax highlighting and completion support 
within f-literals.

I stepped out of the original discussion near the start as it looked 
like we were going to end up with interleaved strings and normal 
expressions, but if that's not the case then it is going to make it very 
difficult to provide a nice coding experience for them.

On 18Aug2016 0805, Philipp A. wrote:
> My proposal is to redo their grammar:
> They shouldn’t be parsed as strings and post-processed, but be their own
> thing. This also opens the door to potentially extending them with
> something like JavaScript’s tagged templates.
>
> Without the limitations of the string tokenization code/rules, only the
> string parts would have escape sequences, and the expression parts would
> be regular python code (“holes” in the literal).

This is where I thought we'd end up - the '{' character (unless escaped 
by, e.g. \N, which addresses a concern below) would terminate the string 
literal and start an expression, which may be followed by a ':' and a 
format code literal. The '}' character would open the next string 
literal, and this continues until the closing quote.
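
As a rough sketch of that scheme (not the real tokenizer; split_f_literal 
is a hypothetical helper, and it ignores \N{...} and doubled-brace escapes 
and assumes the format code contains no quotes or brackets), the body of 
an f-literal could be split into string parts and expression parts like so:

```python
def split_f_literal(body):
    """Split an f-literal body into ('str', text) and
    ('expr', source, format_code) parts."""
    parts = []
    text_start = 0
    i = 0
    while i < len(body):
        if body[i] != "{":
            i += 1
            continue
        # '{' terminates the current string part...
        if i > text_start:
            parts.append(("str", body[text_start:i]))
        # ...and starts an expression, scanned with bracket and quote tracking.
        expr_start = j = i + 1
        depth = 0
        quote = None
        colon = None
        while j < len(body):
            ch = body[j]
            if quote is not None:
                if ch == quote:
                    quote = None          # closing quote of a nested string
            elif ch in "'\"":
                quote = ch                # opening quote of a nested string
            elif ch in "([{":
                depth += 1
            elif ch == "}" and depth == 0:
                break                     # end of the expression part
            elif ch in ")]}":
                depth -= 1
            elif ch == ":" and depth == 0 and colon is None:
                colon = j                 # start of the format code
            j += 1
        else:
            raise ValueError("unterminated expression in f-literal")
        if colon is None:
            parts.append(("expr", body[expr_start:j], ""))
        else:
            parts.append(("expr", body[expr_start:colon], body[colon + 1:j]))
        # '}' opens the next string part.
        i = text_start = j + 1
    if text_start < len(body):
        parts.append(("str", body[text_start:]))
    return parts


parts = split_f_literal("a {x.partition('-')[0]} b {n:>5}")
for part in parts:
    print(part)
```

Note that the quotes inside the first expression need no escaping here, 
because the expression is never treated as part of a string.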

>     They are still strings, there is just post-processing on the string
>     itself to do the interpolation.
>
>
> Sounds hacky to me. I’d rather see a proper parser for them

I believe the proper parser is already used, but the issue is that 
escapes have already been dealt with. Of course, it shouldn't be too 
difficult for the tokenizer to recognize {} quoted expressions within an 
f-literal and not modify escapes. There are multiple ways to handle this.

>     Or another reason is you can explain f-strings as "basically
>     str.format_map(**locals(), **globals()), but without having to make
>     the actual method call" (and worrying about clashing keys but I
>     couldn't think of a way of using dict.update() in a single line).
>     But with your desired change it kills this explanation by saying
>     f-strings aren't like this but some magical string that does all of
>     this stuff before normal string normalization occurs.
>
>
> no, it’s simply the expression parts (that for normal formatting are
> inside of the braces of  .format(...)) are *interleaved* in between
> string parts. they’re not part of the string. just regular plain python
> code.

Agreed. The .format_map() analogy breaks down very quickly when you 
consider f-literals like:

 >>> f'a { \'b\' }'
'a b'

If the contents of the braces were simply keys in the namespace then we 
wouldn't be able to put string literals in there. But because it is an 
arbitrary expression, if we want to put string literals in the f-literal 
(bearing in mind that we may be writing something more like 
f'{x.partition(\'-\')[0]}'), the escaping rules become very messy very 
quickly.
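
To illustrate the difference, here is a small sketch (the variable x and 
its value are made up for the example, and the f-literal uses double 
quotes on the outside to sidestep the escaping question entirely):

```python
x = "left-right"

# A plain name works as a format_map() key.
assert "a {x}".format_map({"x": x}) == "a left-right"

# A method call does not: str.format() treats everything after the dot
# as an attribute name rather than evaluating it as an expression.
try:
    "{x.partition('-')[0]}".format_map({"x": x})
    format_map_failed = False
except (AttributeError, KeyError, ValueError):
    format_map_failed = True

# In an f-literal the braces hold an arbitrary expression, so it is evaluated.
result = f"{x.partition('-')[0]}"

print(format_map_failed, result)
```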

I don't think f'{x.partition('-')[0]}' is any less readable as a result 
of the reused quotes, and it will certainly be easier for highlighters 
to handle (assuming they're doing anything more complicated than simply 
displaying the entire expression in a different colour).

So I too would like to see escapes made unnecessary within the 
expression part of an f-literal. Possibly if we put together a simple 
enough patch for the tokenizer it will be accepted?

Cheers,
Steve