On Jul 16, 2019, at 21:30, Nam Nguyen <bitsink@gmail.com> wrote:

Can you add failure handling without breaking the “~200LOC and easy to read” feature of the library, and without breaking the “easy to read once you grok parser combinators” feature of the parsers built with it?
This is a good request. I will have to play around with this idea more. What I think could be the most challenging task is to attribute failure to the appropriate rule(s) (e.g. expr expects a term + term, but you only have term +). I feel like some metadata about the grammar might be required here, and that might be too unwieldy to provide in a parser combinator formulation.

For what it’s worth, the only time I played with parser combinators in anger, something like 90% of the final code was for tracking the source position and tree path and other state. I’m hoping you can come up with something more clever than we did, so your 200 lines of pretty code don’t turn into 2000 lines of ugly spaghetti.
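
To make the "term +" example concrete, here is a minimal sketch of one common trick: remember only the furthest offset at which any rule failed, plus the labels of the rules that failed there. Everything in it (State, lit, term, seq, alt, run) is a hypothetical name made up for illustration, not the library's API, and it assumes parsers are plain functions of (state, position) that return (value, new_position) or None.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    """Hypothetical shared state: the input plus the furthest failure seen."""
    text: str
    furthest: int = 0                           # furthest offset where a rule failed
    expected: set = field(default_factory=set)  # labels of the rules that failed there

    def fail(self, pos, label):
        # Keep only the failures at the furthest offset reached so far.
        if pos > self.furthest:
            self.furthest, self.expected = pos, {label}
        elif pos == self.furthest:
            self.expected.add(label)
        return None


def lit(s):
    """Match the literal string s."""
    def parse(st, pos):
        if st.text.startswith(s, pos):
            return s, pos + len(s)
        return st.fail(pos, repr(s))
    return parse


def term(st, pos):
    """Match a run of digits and report failure under the rule name 'term'."""
    end = pos
    while end < len(st.text) and st.text[end].isdigit():
        end += 1
    if end == pos:
        return st.fail(pos, "term")
    return int(st.text[pos:end]), end


def seq(*parsers):
    """Run the parsers one after another; fail if any of them fails."""
    def parse(st, pos):
        values = []
        for p in parsers:
            result = p(st, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse


def alt(*parsers):
    """Return the first alternative that succeeds."""
    def parse(st, pos):
        for p in parsers:
            result = p(st, pos)
            if result is not None:
                return result
        return None
    return parse


expr = alt(seq(term, lit("+"), term), term)


def run(parser, text):
    """Parse all of text, or raise SyntaxError at the furthest failure."""
    st = State(text)
    result = parser(st, 0)
    if result is not None and result[1] == len(text):
        return result[0]
    if result is not None:
        st.fail(result[1], "end of input")
    # Line and column are derived from the offset only when reporting.
    line = text.count("\n", 0, st.furthest) + 1
    col = st.furthest - (text.rfind("\n", 0, st.furthest) + 1) + 1
    wanted = " or ".join(sorted(st.expected))
    raise SyntaxError(f"line {line}, column {col}: expected {wanted}")
```

With this sketch, run(expr, "1+2") returns [1, '+', 2], while run(expr, "1+") raises SyntaxError with the message "line 1, column 3: expected term". The combinators themselves stay as short as they would be without error handling, because all the bookkeeping lives in State.fail and run. Whether this heuristic holds up on a realistic grammar, or fits in the 200-line budget, is an open question.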

Interestingly enough, regex doesn't have anything like this either.

Sure, but regex is meant to be dumb, restricted, and fast. Something that’s meant to parse arbitrary, arbitrarily-structured languages needs more failure handling.

Also, regex does have (some) handling for errors in the regex, as opposed to parse failures, and I think your library will probably need at least that much. And, without the separate “compile” and “eval” stages that most regex libraries have, I don’t think you can really separate errors from failure the same way, so you kind of need failure handling just for debugging parsers.

Of course I may be wrong in thinking that this is the more important limitation, rather than performance. But it’s also the more fun limitation to solve, right? :)