On Wed, Jul 24, 2019 at 6:40 PM Eric V. Smith email@example.com wrote:
On 7/24/2019 9:15 PM, Nam Nguyen wrote:
Back to my original requests to the list: 1) Whether we want to have a (possibly private) parsing library in the stdlib, and 2) What features it should have. I have proposed that 1) yes, such a library would be useful, and 2) several requirements that such a library should fulfill.
Are they acceptable? Apparently not. You first asked for a debug trace, Barry wanted better performance, and Chris asked for proof that such a library would help. How about the other points I suggested? Do we need a full-blown universal parser? Is LL(1) enough, or do we need k-token or unlimited lookahead? How about context-sensitive grammars? What performance budget can we spend on this vs the status quo? Do we even care if the parser is small, or that it comes from generated code? Et cetera... There are still plenty of open questions.
I would think deciding on LL(1), or a specific k lookahead, or any other parser feature would depend on which problems you're trying to solve.
I think you're looking at this backwards. You seem to be saying "here's a parser, now solve problems in the stdlib with it".
Yes. I started out with the assumption that a parser library can certainly help with code in the stdlib, because I have seen or learned of problems in the stdlib (though not *all* of them) that such a library could solve. This is where collective wisdom could help.
For something in the stdlib, I think it has to be "here are the problems to be solved, now I'll design a parser to solve them".
I gave links to CVEs and bugs in BPO. Those are illustrative of problems to be solved. If you know of other places where a parser can help, let's hear it.
To take but one example: what if it turns out that a URL can't be parsed with an LL(1) parser, and you need more horsepower? Don't you think you'd need to know that in advance of proposing an LL(1) parser for the stdlib?
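To make the lookahead question concrete, here is a minimal sketch of a predictive parser in which every decision is made from a single character of lookahead, which is the LL(1) property. The grammar is a deliberately simplified, hypothetical one (absolute URLs of the form scheme://host[:port][/path] only), not RFC 3986, and the names LL1UrlParser and parse_url are made up for illustration. It says nothing either way about whether full URL syntax fits in LL(1).

```python
import string

ALPHA = set(string.ascii_letters)
DIGIT = set(string.digits)


class LL1UrlParser:
    """Predictive parser for an assumed, simplified grammar (not RFC 3986):

        url    -> scheme ':' '/' '/' host [':' port] path
        scheme -> ALPHA (ALPHA | DIGIT | '+' | '-' | '.')*
        host   -> (ALPHA | DIGIT | '-' | '.')+
        port   -> DIGIT+
        path   -> '' | '/' <rest of input>
    """

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def peek(self):
        # The single character of lookahead; '' marks end of input.
        return self.text[self.pos] if self.pos < len(self.text) else ""

    def expect(self, ch):
        if self.peek() != ch:
            raise ValueError(f"expected {ch!r} at offset {self.pos}")
        self.pos += 1

    def take_while(self, allowed):
        # Consume characters for as long as the lookahead is in `allowed`.
        start = self.pos
        while self.peek() and self.peek() in allowed:
            self.pos += 1
        return self.text[start:self.pos]

    def parse(self):
        if self.peek() not in ALPHA:
            raise ValueError("scheme must start with a letter")
        scheme = self.take_while(ALPHA | DIGIT | set("+-."))
        for ch in "://":
            self.expect(ch)
        host = self.take_while(ALPHA | DIGIT | set("-."))
        if not host:
            raise ValueError(f"empty host at offset {self.pos}")
        port = None
        if self.peek() == ":":  # one character decides whether a port follows
            self.pos += 1
            digits = self.take_while(DIGIT)
            if not digits:
                raise ValueError(f"empty port at offset {self.pos}")
            port = int(digits)
        if self.peek() not in ("", "/"):
            raise ValueError(f"unexpected {self.peek()!r} at offset {self.pos}")
        path = self.text[self.pos:]
        self.pos = len(self.text)
        return {"scheme": scheme, "host": host, "port": port, "path": path}


def parse_url(text):
    return LL1UrlParser(text).parse()
```

With this sketch, parse_url("http://example.com:8080/a/b") yields the scheme, host, port and path as a dict, while malformed input such as "http:/x" raises ValueError at the offending offset instead of being silently accepted.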
And herein lies an issue: you can't design a parser that solves every issue that the stdlib will ever have.
Neither looking for nor dreaming of such a thing ;). But with educated guesses, we could settle on a set of requirements that should hold for a reasonable time. For example, we expect to deal mostly with context-free grammars, so anything that can parse all of them is an acceptable solution.
At best you can only solve the existing problems. But if you do that, what do you do if we want to add something new to the stdlib that your parser doesn't have enough power to parse?
Accept that fact, and work on solutions then. I mean, no one can wholly foresee the future. If a need arises that isn't accommodated by existing tools, well, we invent new tools. I don't see that as an impediment to making improvements now.
To try and be more concretely helpful: at the very least, I think you should propose a specific set of things in the existing stdlib that would benefit from a parser generator, instead of the existing ad hoc parsers being used. Bonus points for showing how the code is simplified and/or made more secure by using a parser generator.
Most of this thread was about how my PoC could help with urlparse. I'm sorry that the snipping has cut off the context, but you can find the previous exchanges in the archive.