On 8/10/2019 7:46 PM, Glenn Linderman wrote:
Because of the "invalid escape sequence" and "raw string" discussion, when looking at the documentation, I also noticed the following description for f-strings:

Escape sequences are decoded like in ordinary string literals (except when a literal is also marked as a raw string). After decoding, the grammar for the contents of the string is:
followed by lots of stuff, followed by
Backslashes are not allowed in format expressions and will raise an error:
f"newline: {ord('\n')}"  # raises SyntaxError

What I don't understand is how, if f-strings are processed AS DESCRIBED, the \n is ever seen by the format expression.
If I recall correctly, the mentioned decoding is happening on the string literal parts of the f-strings (above, the "newline: " part), not the expression parts (inside the {}). But it's been a while and I don't recall all of the details.
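To make that split concrete, here is a small sketch (mine, not taken from the docs). The helper name compiles() is made up for illustration. It keeps the documentation example as source text and hands it to compile(), so the snippet itself loads on any interpreter. As far as I know the backslash restriction applies to 3.6 through 3.11 and was lifted in 3.12, so the sketch reports the result rather than assuming it.

```python
import sys


def compiles(source: str) -> bool:
    """Hypothetical helper: True if `source` compiles as an expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# Escapes in the literal part are decoded as in an ordinary string literal.
literal_part = f"tab:\t{1 + 1}"

# The example from the documentation, held as source text (raw, so the
# backslash survives) and compiled separately.
doc_example = r'''f"newline: {ord('\n')}"'''
backslash_allowed = compiles(doc_example)

# Portable workaround: keep the backslash outside the braces.
nl = "\n"
workaround = f"newline: {ord(nl)}"

print(sys.version_info[:2], repr(literal_part), backslash_allowed, workaround)
```

Moving the escape into a variable, as in the last two lines, sidesteps the question on every version that has f-strings.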

The description is that they are first decoded like ordinary strings, and then parsed for the internal grammar containing {} expressions to be expanded.  If that were true, the \n in the above example would already be a newline character, and the parsing of the format expression would not see the backslash. And if it were true, that would actually be far more useful for this situation.

So given that it is not true, why not? And why go to the extra work of prohibiting \ in the format expressions?

It's a future-proofing thing. See the discussion at
https://mail.python.org/archives/list/python-dev@python.org/thread/EVXD72IYUN2APF2443OMADKA5WJTOKHD/
which has pointers to other parts of the discussion.

At some point, I'm planning on switching the parsing of f-strings from the custom parser (see Python/ast.c, FstringParser_ConcatFstring()) to having the Python parser itself parse the f-strings. This will be similar to PEP 536, which doesn't have much detail, but does describe some of the motivations.


PEP 498, of course, has an apparently more accurate description, that the {} parsing actually happens before the escape processing. Perhaps this avoids making multiple passes over the string to do the work, as the literal pieces and format expression pieces have to be separate in the generated code, but that is just my speculation: I'd like to know the real reason.
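The difference between the two orderings can be seen with a toy model. This is only an illustration, not CPython's f-string parser: it borrows string.Formatter().parse(), which implements the str.format() field grammar, to do the {} splitting, and the "unicode_escape" codec to stand in for escape decoding. The helper names are hypothetical.

```python
from string import Formatter


def decode_escapes(text: str) -> str:
    """Decode backslash escapes in ASCII text, roughly as a literal would."""
    return text.encode("ascii").decode("unicode_escape")


def split_fields(text: str):
    """Return (literal, expression) pairs using str.format's field splitter."""
    return [(lit, expr) for lit, expr, _spec, _conv in Formatter().parse(text)]


# Body of the f-string as it appears in the source file, escapes not yet decoded.
BODY = r"tab:\t{ord('\n')}"

# Order described in PEP 498: split on {} first, decode only the literal pieces.
split_first = [(decode_escapes(lit), expr) for lit, expr in split_fields(BODY)]

# Order described in the reference docs: decode everything, then split.
decode_first = split_fields(decode_escapes(BODY))

print(split_first)
print(decode_first)
```

With the first ordering the expression text still contains a backslash followed by n; with the second it contains a real newline character, so the expression would never see the backslash at all.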

Should the documentation be fixed to make the description more accurate? If so, I'd be glad to open an issue.

Sure. I'm always in favor of accuracy. The f-string documentation was a last-minute rush job that could have used a lot more editing, and more eyes are always welcome.

But it will take a fair amount of research to understand it well enough to document it in more detail.


The PEP further contains the inaccurate statement:

Like all raw strings in Python, no escape processing is done for raw f-strings:

not mentioning the actual escape processing that is done for raw strings, regarding \" and \'.

It should probably just say it uses the same rules as raw strings.
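For reference, a quick sketch of what "the same rules as raw strings" means in practice. The compiles() helper is again hypothetical, used only so the invalid literal can be checked without breaking the snippet.

```python
def compiles(source: str) -> bool:
    """Hypothetical helper: True if `source` compiles as an expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# In a raw string the backslash stops the quote from ending the literal,
# but the backslash itself stays in the value: two characters, \ and ".
raw_quote = r"\""

# For the same reason a raw string cannot end in a single backslash.
# The source text being compiled here is: r"\"
trailing_backslash_ok = compiles('r"\\"')

# A raw f-string keeps the backslash in the literal part and still
# evaluates the expression.
raw_f = rf"\n{1 + 1}"

print(len(raw_quote), trailing_backslash_ok, repr(raw_f))
```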

Eric