New subject: [Python-ideas] Let’s make escaping in f-literals impossible

Aug. 18, 2016

Hi, I originally posted this via Google Groups, but it didn’t make it
through to the list proper, sorry! Please read it here:
https://groups.google.com/forum/#!topic/python-ideas/V1U6DGL5J1s

My arguments are basically:

   1. f-literals are semantically not strings, but expressions.
   2. Their escape sequences in the code parts are fundamentally both
   detrimental and superfluous (they’re only in for convenience, as confirmed
   by Guido in the quote below)
   3. They’re detrimental because syntax highlighters are (by design)
   unable to handle this part of Python 3.6a4’s grammar. This will cause code
   to be highlighted as part of a string and therefore overlooked. I’m very
   sure this will cause bugs.
   4. The fact that people see the embedded expressions as somehow “part of
   the string” is confusing.
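
As a rough illustration of point 1 (not part of the original post): released
CPython versions expose an f-literal in the AST as a JoinedStr expression node
whose children alternate between constant string parts and FormattedValue
"holes". The sketch below assumes CPython 3.8 or later, where the string parts
show up as Constant nodes; the sample literal is made up for the example.

```python
import ast

# Parse a sample f-literal as an expression (the literal itself is hypothetical).
tree = ast.parse("f'total: {price * 2:>8.2f}!'", mode="eval")
joined = tree.body

# The whole literal is one expression node ...
print(type(joined).__name__)  # JoinedStr

# ... whose children interleave string parts and expression holes.
kinds = [type(node).__name__ for node in joined.values]
print(kinds)  # on CPython 3.8+: ['Constant', 'FormattedValue', 'Constant']

# The hole holds ordinary Python: here a binary operation, not string data.
hole_kind = type(joined.values[1].value).__name__
print(hole_kind)  # BinOp
```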

My proposal is to redo their grammar:
They shouldn’t be parsed as strings and post-processed, but be their own
thing. This also opens the door to potentially extending them with something
like JavaScript’s tagged templates.

Without the limitations of the string tokenization code/rules, only the
string parts would have escape sequences, and the expression parts would be
regular Python code (“holes” in the literal).

Below are the quote mentioned above and my replies to some responses in the
original thread:

Guido van Rossum <guido@python.org> wrote on Wed, 17 Aug 2016 at 20:11:
...
> The explanation is honestly that the current approach is the most
> straightforward for the implementation (it's pretty hard to intercept the
> string literal before escapes have been processed) and nobody cares enough
> about the edge cases to force the implementation to jump through more hoops.
> I really don't think this discussion should be reopened. If you disagree,
> please start a new thread on python-ideas.

I really think it should. Please look at Python code with f-literals. If
they’re highlighted as strings throughout, you won’t be able to spot which
parts are code. If they’re highlighted as code, the escaping rules
guarantee that most highlighters can’t correctly highlight Python anymore.
I think that’s a big issue for readability.

Brett Cannon <brett@python.org> wrote on Wed, 17 Aug 2016 at 20:28:
...
> They are still strings, there is just post-processing on the string itself
> to do the interpolation.

Sounds hacky to me. I’d rather see a proper parser for them, which of
course would make my vision easy to realize.

...
> By doing it this way the implementation can use Python itself to do the
> tokenizing of the string, while if you do the string interpolation
> beforehand you would then need to do it entirely at the C level which is
> very messy and painful since you're explicitly avoiding Python's automatic
> handling of Unicode, etc.

Of course we reuse the tokenization for the string parts. As I said, you can
view an f-literal as an interleaved sequence of strings and expressions with
an attached format specification.

<f'> starts the f-literal; string contents follow. The only difference to other strings is
<{> which starts expression tokenization. Once the expression ends, an optional
<formatspec> follows, then a
<}> to switch back to string tokenization.
This repeats until (in string parsing mode) a
<'> is encountered, which ends the f-literal.
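
The mode switching described above can be sketched as a small scanner. This is
only a toy under stated assumptions, not the CPython tokenizer: the names
split_f_literal and _read_hole are hypothetical, the input is the body between
the quotes, doubled braces are treated as literal braces, and conversions such
as !r, nested fields inside the format spec, and colons in lambdas are not
handled.

```python
def _read_hole(body, i):
    """Read one expression hole starting at index i (just after the '{').

    Returns (code, format_spec_or_None, index_after_closing_brace).
    """
    start = i
    depth = 0      # nesting of (), [] and {} inside the expression
    quote = None   # quote character of a string inside the expression, if any
    while i < len(body):
        ch = body[i]
        if quote:
            if ch == "\\":
                i += 1              # skip the escaped character in a nested string
            elif ch == quote:
                quote = None
        elif ch in "'\"":
            quote = ch
        elif ch in "([{":
            depth += 1
        elif ch in ")]}" and depth > 0:
            depth -= 1
        elif ch == ":" and depth == 0:
            # Top-level colon: the rest, up to the next '}', is the format spec.
            end = body.find("}", i + 1)
            if end == -1:
                raise ValueError("unterminated format spec")
            return body[start:i], body[i + 1:end], end + 1
        elif ch == "}" and depth == 0:
            return body[start:i], None, i + 1
        i += 1
    raise ValueError("unterminated expression")


def split_f_literal(body):
    """Split an f-literal body into ('str', text) and ('expr', code, spec) parts."""
    parts, text, i = [], [], 0
    while i < len(body):
        ch = body[i]
        if body.startswith("{{", i) or body.startswith("}}", i):
            text.append(ch)         # doubled brace stands for one literal brace
            i += 2
        elif ch == "{":
            if text:
                parts.append(("str", "".join(text)))
                text = []
            code, spec, i = _read_hole(body, i + 1)
            parts.append(("expr", code, spec))
        elif ch == "}":
            raise ValueError("single '}' in string mode")
        else:
            text.append(ch)
            i += 1
    if text:
        parts.append(("str", "".join(text)))
    return parts


# The hole may contain the same quote character as the literal, unescaped.
parts = split_f_literal("Hello {user['name']}, total: {price * 2:>8.2f}!")
for part in parts:
    print(part)
```

In this sketch only the ('str', ...) parts would go through escape-sequence
processing; the ('expr', ...) parts would be handed to the regular expression
parser unchanged.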

> You also make it harder to work with Unicode-based variable names (or at
> least explain it). If you have Unicode in a variable name but you can't use
> \N{} in the string to help express it you then have to say "normal Unicode
> support in the string applies everywhere *but* in the string interpolation
> part".

I think you’re just proving my point that the way f-literals work now is
confusing.

The embedded expressions are just normal Python. The embedded strings are
just normal strings. You can simply switch between both using <{> and
<[format]}>.

Unicode in variable names works exactly the same as in all other Python
code because it is regular Python code.

> Or another reason is you can explain f-strings as "basically
> str.format_map(**locals(), **globals()), but without having to make the
> actual method call" (and worrying about clashing keys but I couldn't think
> of a way of using dict.update() in a single line). But with your desired
> change it kills this explanation by saying f-strings aren't like this but
> some magical string that does all of this stuff before normal string
> normalization occurs.

No, it’s simply that the expression parts (which for normal formatting are
passed as arguments to .format(...)) are *interleaved* between the string
parts. They’re not part of the string, just regular plain Python code.
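
To make the difference concrete, here is a small sketch (the function names
and sample values are made up): for a plain name, an f-literal and
str.format_map give the same result, but a hole containing a real expression
only works in the f-literal, because format_map treats the text between the
braces as a lookup key rather than as code.

```python
def compare(name):
    """Format the same template as an f-literal and via str.format_map."""
    as_f_literal = f"Hello, {name}!"
    as_format_map = "Hello, {name}!".format_map({"name": name})
    return as_f_literal, as_format_map


def expression_in_f_literal(name):
    """The hole is ordinary Python code, so any expression is allowed."""
    return f"{len(name) + 1}"


def expression_in_format_map(name):
    """format_map looks up the literal key 'len(name) + 1' instead of evaluating it."""
    try:
        return "{len(name) + 1}".format_map({"name": name})
    except KeyError as exc:
        return f"KeyError: {exc}"


print(compare("Ada"))
print(expression_in_f_literal("Ada"))
print(expression_in_format_map("Ada"))
```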

Cheers, and I really hope I’ve made a strong case,
philipp
