gmcm at hypernet.com
Sun Apr 23 22:48:40 CEST 2000
Andrew Dalke wrote:
> I'm interested in writing a parser generator which is somewhat
> different than yacc/flex, SPARK, Plex, etc. Those parsers generate
> code which verifies every byte in the file and expects that I'm
> interested in most of the data.
> In my case, I have a lot of data (>100MB) in a known good format
> of which I'm only interested in a few items, but the specific items
> can change.
Can't answer your direct question, Andrew, but I can tell you
that I've used SPARK without a full grammar (although not on
100MB files). I used an "outer scanner" and an "inner
scanner". The outer just looked for interesting bits. When it
found something, it passed it to the inner for full mastication.
If you need more sophistication, you probably need one where
the parser can reach back into the scanner and change the
rules on the fly. I think Aaron's stuff allows that, but I don't
know for sure.
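
To make the outer/inner split concrete, here is a rough sketch using only the standard `re` module rather than SPARK itself. The record format (`record <name> { key = value; }`) and the names `outer_scan`, `inner_scan` and `extract` are made up for illustration; they are not from SPARK or from Andrew's data. The outer scanner only locates records with a wanted name, and the inner scanner fully tokenizes and checks just those.

```python
import re

# Hypothetical input format, assumed purely for illustration:
#   record <name> { key = value; key = value; }
SAMPLE = (
    "record alpha { x = 1; y = two words; }\n"
    "record beta { junk that is never parsed }\n"
    "record gamma { z = 3; }\n"
)

def outer_scan(text, wanted):
    """Outer scanner: yield (name, body) only for records named in `wanted`."""
    names = "|".join(re.escape(n) for n in wanted)
    pattern = re.compile(r"\brecord\s+(%s)\s*\{([^}]*)\}" % names)
    for match in pattern.finditer(text):
        yield match.group(1), match.group(2)

# One "key = value;" field inside a record body.
FIELD = re.compile(r"\s*(\w+)\s*=\s*([^;]*?)\s*;")

def inner_scan(body):
    """Inner scanner: tokenize a whole record body, rejecting malformed input."""
    fields = {}
    pos = 0
    while body[pos:].strip():
        match = FIELD.match(body, pos)
        if match is None:
            raise ValueError("bad field at offset %d" % pos)
        fields[match.group(1)] = match.group(2)
        pos = match.end()
    return fields

def extract(text, wanted):
    """Run the inner scanner on each chunk the outer scanner hands over."""
    return {name: inner_scan(body) for name, body in outer_scan(text, wanted)}
```

Calling `extract(SAMPLE, ["alpha", "gamma"])` never runs the inner scanner on the `beta` record, so its malformed body is skipped rather than rejected. Changing the `wanted` list is all it takes to pick out different items.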