[Python-Dev] Parsing f-strings from PEP 498 -- Literal String Interpolation

Fri Nov 4 13:15:07 EDT 2016

On 11/4/2016 10:50 AM, Fabio Zadrozny wrote:
>     In what way do you think the implementation isn't ready for a final
>     release?
>
>
> Well, the cases listed in the docs
> (https://hg.python.org/cpython/file/default/Doc/reference/lexical_analysis.rst)
> don't work in the latest release (with SyntaxErrors) -- and the bug I
> created related to it: http://bugs.python.org/issue28597 was promptly
> closed as duplicate -- so, I assumed (maybe wrongly?) that the parsing
> still needs work.

It's not the parsing that needs work, it's the documentation. Those 
examples used to work, but the parser was deliberately changed to not 
support them. There's a long discussion on python-ideas about it, 
starting at 
https://mail.python.org/pipermail/python-ideas/2016-August/041727.html

> It'd be nice if at least this description could be added to the PEP (as
> all other language implementations and IDEs will have to work the same
> way and will probably reference it) -- a grammar example, even if not
> used, would be helpful (personally, I think hand-crafted parsers are
> always worse in the long run compared to having a proper grammar with a
> parser, although I understand that if you're not really used to it, it
> may be more work to set it up).

I've written a parser generator just to understand how they work, so I'm 
completely sympathetic to this. However, in this case, I don't think it 
would be any easier. I'm basically writing a tokenizer, not an 
expression parser. It's much simpler. The actual parsing is handled by 
PyParser_ASTFromString. And as I state below, you have to also consider 
the parser consumers.

> Also, I find it a bit troubling that PyParser_ASTFromString is used
> there and not just the node which would be related to an expression.
> I understand it's probably an easier approach, but in the end you
> probably have to filter it and end up just accepting what's beneath
> the "test" from the grammar, no? (i.e.: that's what a lambda body
> accepts).

Using PyParser_ASTFromString is the easiest possible way to do this. 
Given a string, it returns an AST node. What could be simpler?
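
As a rough Python-level illustration of the same idea (this is not the C 
code under discussion): ast.parse in "eval" mode takes the text of an 
expression and returns an AST node. The helper name 
parse_embedded_expression is hypothetical, and wrapping the text in 
parentheses is an assumption of this sketch.

```python
import ast


def parse_embedded_expression(source):
    """Parse the text of one embedded expression into an AST node.

    Hypothetical helper. The surrounding parentheses are an assumption of
    this sketch: they keep leading whitespace from being read as indentation.
    """
    tree = ast.parse("(" + source + ")", mode="eval")
    # In "eval" mode the result is an ast.Expression; its body is the
    # node for the expression itself.
    return tree.body


node = parse_embedded_expression(" a + b")
print(type(node).__name__)  # BinOp
```

Anything that is not a single expression, such as "a +", raises 
SyntaxError from ast.parse, so the caller gets error checking for free.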

> Well, I think all language implementations / IDEs (or at least those
> which want to give syntax errors) will *have* to look inside f-strings.

While it's probably true that IDEs (and definitely language 
implementations) will want to parse f-strings, I think there are many 
more code scanners that are not language implementations or IDEs. And by 
being "just" regular strings with a new prefix, it's trivial to get any 
parser that doesn't care about the internal structure to at least see 
f-strings as normal strings.
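
To make that concrete, here is a minimal sketch of the kind of code 
scanner I have in mind. The regular expression and the find_strings 
helper are hypothetical and only cover single-line literals; the point 
is that accepting f-strings amounts to allowing one more prefix letter.

```python
import re

# Hypothetical pattern for a scanner that only needs to skip over
# single-line string literals. Adding "f" to the prefix letters is the
# only change needed to treat f-strings like any other string.
STRING_LITERAL = re.compile(
    r"""(?<![A-Za-z0-9_])[rbuf]{0,2}("(?:\\.|[^"\\\n])*"|'(?:\\.|[^'\\\n])*')""",
    re.IGNORECASE,
)


def find_strings(source):
    """Return every single-line string literal in source, prefix included."""
    return [match.group(0) for match in STRING_LITERAL.finditer(source)]


code = 'name = "x"\nmsg = f"hi {name}"\n'
print(find_strings(code))
```

The scanner never looks inside the braces; the f-string is one opaque 
literal to it.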

> Also, you could still have a separate grammar saying how to look inside
> f-strings (this would make the lives of other implementors easier) even
> if it was a post-processing step as you're doing now.

Yes. I've contemplated exposing the f-string scanner. That's the part 
that returns expressions (as strings) and literal strings. I realize 
that won't help 3.6.
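
For a feel of what such a scanner would return, here is an approximation 
built on string.Formatter.parse from the standard library. That method 
parses str.format() fields, not f-strings, so it mishandles expressions 
containing ':' or '!'; scan_segments is a hypothetical name and this is 
not the scanner in the C implementation.

```python
from string import Formatter


def scan_segments(template):
    """Split a format-style string into ("literal", text) and ("expr", text) pairs.

    Hypothetical helper built on the str.format() field parser. It only
    approximates f-string rules and is meant for simple expressions.
    """
    segments = []
    for literal, field, _spec, _conversion in Formatter().parse(template):
        if literal:
            segments.append(("literal", literal))
        if field is not None:
            segments.append(("expr", field))
    return segments


print(scan_segments("Hello {name}, you are {age + 1} years old"))
```

Each "expr" segment is still just a string; turning it into an AST is a 
separate step.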

Eric.