On Wed, Jul 24, 2019 at 6:40 PM Eric V. Smith <eric@trueblade.com> wrote:
On 7/24/2019 9:15 PM, Nam Nguyen wrote:
> Back to my original requests to the list: 1) Whether we want to have a
> (possibly private) parsing library in the stdlib, and 2) What features
> it should have. I have proposed that 1) yes, such a library would be
> useful, and 2) several requirements that such a library should fulfill.
> Are they acceptable? Apparently not. You first requested a debug
> trace, Barry wanted better performance, and Chris asked for proof that such
> a library would help. How about the other points I suggested? Do we need a
> full-blown universal parser? Is LL(1) enough, or do we need k /
> unlimited lookaheads? How about context sensitive grammars? What
> performance budget can we spend on this vs the status quo? Do we even
> care if the parser is small, or that it comes from generated code? Et
> cetera... There are still plenty of open questions.

I would think deciding on LL(1), or a specific k lookahead, or any other
parser feature would depend on which problems you're trying to solve.

I think you're looking at this backwards. You seem to be saying "here's
a parser, now solve problems in the stdlib with it".

Yes. I started out with the assumption that a parser library can certainly help with code in the stdlib, because I have seen or come to know of problems in the stdlib (though not *all* of them) that such a library could solve. This is where collective wisdom could help.

For something in
the stdlib, I think it has to be "here are the problems to be solved,
now I'll design a parser to solve them".

I gave links to CVEs and bugs in BPO. Those are illustrative of problems to be solved. If you know of other places where a parser can help, let's hear it.

To take but one example: what if it turns out that a URL can't be parsed
with an LL(1) parser, and you need more horsepower? Don't you think you'd
need to know that in advance of proposing an LL(1) parser for the stdlib?

And herein lies an issue: you can't design a parser that solves every
issue that the stdlib will ever have.

I am neither looking for nor dreaming of such a parser ;). But with educated guesses, we could settle on a set of requirements that holds for a reasonable time. For example, we expect to deal mostly with context-free grammars, so anything that can parse all of them is an acceptable solution.

At best you can only solve the
existing problems. But if you do that, what do you do if we want to add
something new to the stdlib that your parser doesn't have enough power
to parse?

Accept that fact, and work on solutions then. I mean, no one can wholly foresee the future. If a need arises that isn't accommodated by existing tools, well, we invent new tools. I don't see that as an impediment to making improvements now.

To try and be more concretely helpful: at the very least, I think you
should propose a specific set of things in the existing stdlib that
would benefit from a parser generator, instead of the existing ad hoc
parsers being used. Bonus points for showing how the code is simplified
and/or made more secure by using a parser generator.

Most of this thread was about how my PoC could help with urlparse. I'm sorry that the snipping has cut off the context, but you'll be able to find the previous exchanges in the archive.
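
To make the lookahead question above a little more concrete, here is a minimal sketch. It is not the PoC discussed in this thread, and it is not how urlparse works. It assumes a toy, character-level view of a URL reference in which the parser must choose between "scheme ':' rest" and "relative reference" as its first decision; the scheme character set loosely follows the RFC 3986 scheme rule. The function name split_scheme and the "peeked" counter are hypothetical, introduced only for illustration.

```python
import string

# Characters allowed in a scheme, loosely following RFC 3986:
#   scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME_START = set(string.ascii_letters)
SCHEME_REST = SCHEME_START | set(string.digits) | set("+-.")


def split_scheme(ref):
    """Return (scheme, rest, peeked) for a URL reference.

    `peeked` counts how many characters had to be examined before the
    choice between "has a scheme" and "relative reference" was settled.
    This is an illustrative helper, not a stdlib API.
    """
    peeked = 0
    if ref and ref[0] in SCHEME_START:
        for ch in ref:
            peeked += 1
            if ch == ":":
                # Everything before the colon is the scheme.
                return ref[:peeked - 1], ref[peeked:], peeked
            if ch not in SCHEME_REST:
                # A character such as "/" rules out a scheme.
                break
    else:
        # The first character (if any) already rules out a scheme.
        peeked = min(len(ref), 1)
    return None, ref, peeked


if __name__ == "__main__":
    for ref in ("http://example.com/", "docs/index.html", "/abs"):
        print(ref, split_scheme(ref))
```

In this formulation the number of characters examined grows with the length of the first segment, so no fixed k works if the two alternatives are written as separate productions over characters. Whether that rules out LL(1) depends on how the grammar is written: left-factoring the common prefix, or running a tokenizer first, removes the problem. That is the kind of thing worth settling per use case before fixing the parser's power.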