<div dir="ltr"><div class="gmail_quote"><div>Hi, I originally posted this via Google Groups, but it didn’t make it through to the list proper, sorry! Please read it here: <a href="https://groups.google.com/forum/#!topic/python-ideas/V1U6DGL5J1s">https://groups.google.com/forum/#!topic/python-ideas/V1U6DGL5J1s</a><br><br></div><div>My arguments are basically:<br><ol><li>f-literals are semantically not strings, but expressions.</li><li>Their escape sequences in the code parts are both detrimental and superfluous (they’re only in for convenience of implementation, as confirmed by Guido in the quote below).</li><li>They’re detrimental because syntax highlighters are (by design) unable to handle this part of Python 3.6a4’s grammar. This will cause code to be highlighted as part of a string and therefore overlooked. I’m very sure this will cause bugs.</li><li>The fact that people see the embedded expressions as somehow “part of the string” is confusing.<br></li></ol></div><div>My proposal is to redo their grammar:<br></div><div>They shouldn’t be parsed as strings and post-processed, but be their own thing. This also opens the door to potentially extending them with something like JavaScript’s tagged templates.<br><br></div><div>Without the limitations of the string tokenization code/rules, only the string parts would have escape sequences, and the expression parts would be regular Python code (“holes” in the literal).<br><br></div><div>Below are the mentioned quote and some replies to the original thread:<br></div><div dir="ltr"><br>Guido van Rossum &lt;<a href="mailto:guido@python.org">guido@python.org</a>&gt; wrote on Wed, 17 Aug 2016 at 20:11:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">The

 explanation is honestly that the current approach is the most 

straightforward for the implementation (it's pretty hard to intercept 

the string literal before escapes have been processed) and nobody cares 

enough about the edge cases to force the implementation to jump through 

more hoops.<br><div dir="ltr"><div class="gmail_extra"><br></div><div class="gmail_extra">I really don't think 

this discussion should be reopened. If you disagree, please start a new 

thread on python-ideas.<br clear="all"></div></div></blockquote><div><br></div><div>I really think it should. Please look at Python code with f-literals. If they’re highlighted as strings throughout, you won’t be able to spot which parts are code. If they’re highlighted as code, the escaping rules guarantee that most highlighters can no longer highlight Python correctly. I think that’s a big issue for readability.<br></div><div><br><div class="gmail_quote"><div dir="ltr">Brett Cannon &lt;<a href="mailto:brett@python.org">brett@python.org</a>&gt; wrote on Wed, 17 Aug 2016 at 20:28:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div><span style="line-height:1.5">They are still strings, there is just post-processing on the string itself to do the interpolation.</span></div></div></div></blockquote><div><br></div><div>Sounds hacky to me. I’d rather see a proper parser for them, which of course would make my vision easy to implement.<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div></div></div></div><div dir="ltr"><div class="gmail_quote"><div>By doing it this way the implementation can use 

Python itself to do the tokenizing of the string, while if you do the 

string interpolation beforehand you would then need to do it entirely at

 the C level which is very messy and painful since you're explicitly 

avoiding Python's automatic handling of Unicode, etc.<br></div></div></div></blockquote><div><br></div><div>Of course we reuse the tokenization for the string parts. As said, you can view an f-literal as an interleaved sequence of strings and expressions, each expression with an optional attached format specification.<br><br></div><div>&lt;f'&gt; starts the f-literal, and string contents follow. The only difference from other strings is<br></div><div>&lt;{&gt;, which starts expression tokenization. Once the expression ends, an optional<br></div><div>&lt;formatspec&gt; follows, then a<br></div><div>&lt;}&gt; to switch back to string tokenization.<br></div><div>This repeats until (in string parsing mode) a<br></div><div>&lt;'&gt; is encountered, which ends the f-literal.<br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div></div><div></div><div>You also make it harder to work with 

Unicode-based variable names (or at least explain it). If you have 

Unicode in a variable name but you can't use \N{} in the string to help 

express it you then have to say "normal Unicode support in the string 

applies everywhere *but* in the string interpolation part".</div></div></div></blockquote><div><br></div><div>I think you’re just proving my point that the way f-literals work now is confusing.<br><br></div><div>The embedded expressions are just normal Python. The embedded strings are just normal strings. You can simply switch between the two using &lt;{&gt; and &lt;[format]}&gt;.<br></div><div><br></div><div>Unicode in variable names works exactly the same as in all other Python code, because it is regular Python code.<br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div></div><div>Or

 another reason is you can explain f-strings as "basically 

str.format_map(**locals(), **globals()), but without having to make the 

actual method call" (and worrying about clashing keys but I couldn't 

think of a way of using dict.update() in a single line). But with your 

desired change it kills this explanation by saying f-strings aren't like

 this but some magical string that does all of this stuff before normal 

string normalization occurs.</div></div></div></blockquote><div><br></div><div>No, it’s simply that the expression parts (which for normal formatting sit inside the parentheses of .format(...)) are *interleaved* between the string parts. They’re not part of the string, just regular plain Python code.<br><br></div><div>Cheers, and I really hope I’ve made a strong case,<br></div><div>philipp<br></div></div></div></div></div>