[pyparsing] How to get arbitrary text surrounded by keywords?
ptmcg at austin.rr._bogus_.com
Mon Nov 28 22:00:58 CET 2005
"Inyeol Lee" <inyeol.lee at siliconimage.com> wrote in message
news:mailman.1297.1133203971.18701.python-list at python.org...
> I'm trying to extract module contents from Verilog, which has the form
> module foo (port1, port2, ... );
> // module contents to extract here.
> To extract the module contents, I'm planning to do something like;
> from pyparsing import *
> ident = Word(alphas+"_", alphanums+"_")
> module_begin = Group("module" + ident + "(" + OneOrMore(ident) + ")" + ";")
> module_contents = ???
> module_end = Keyword("endmodule")
> module = Group(module_begin + module_contents + module_end)
> (above code not tested.)
> How should I write the part of 'module_contents'? It's an arbitrary text
> which doesn't contain 'endmodule' keyword. I don't want to use full
> scale Verilog parser for this task.
The simplest way is to use SkipTo. This only works if you don't have to
worry about nesting. I think Verilog supports nested modules, but if the
files you are parsing don't use this feature, then SkipTo will work just fine:
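# delimitedList(ident) accepts the comma-separated port names; OneOrMore(ident) would stop at the first comma
module_begin = Group("module" + ident + "(" + delimitedList(ident) + ")" + ";")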
module_end = Keyword("endmodule")
module_contents = SkipTo(module_end)
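
Put together, the SkipTo approach might look like the following sketch. The sample Verilog text, the results names ("name", "ports", "contents") and the helper extract_modules are made up for illustration, and the comma handling in the port list is an addition to cope with the comma-separated ports in the sample. It assumes modules are not nested and that 'endmodule' never appears inside a comment or string.

```python
from pyparsing import (Group, Keyword, SkipTo, Suppress, Word, ZeroOrMore,
                       alphanums, alphas)

# Hypothetical sample input, made up for illustration.
SAMPLE = """
module foo (a, b);
  wire w;
  assign w = a & b;
endmodule

module bar (clk);
  reg r;
endmodule
"""

ident = Word(alphas + "_", alphanums + "_")
# Port names separated by commas; the commas are dropped from the results.
port_list = Group(ident + ZeroOrMore(Suppress(",") + ident))
module_begin = (Keyword("module") + ident("name")
                + Suppress("(") + port_list("ports")
                + Suppress(")") + Suppress(";"))
module_end = Keyword("endmodule")
# SkipTo collects everything up to, but not including, the next 'endmodule'.
module_contents = SkipTo(module_end)("contents")
module = Group(module_begin + module_contents + module_end)


def extract_modules(text):
    """Return a (name, ports, contents) tuple for each non-nested module."""
    found = []
    for tokens, _start, _end in module.scanString(text):
        m = tokens[0]
        found.append((m["name"], list(m["ports"]), m["contents"].strip()))
    return found


modules = extract_modules(SAMPLE)
for name, ports, contents in modules:
    print(name, ports)
    print(contents)
```

scanString is used here rather than parseString so that any text between modules is skipped instead of raising a parse error.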
If you *do* care about nested modules, then a parse action might help you
handle these cases. But this starts to get trickier, and you may just want
to consider a more complete grammar. If your application is non-commercial
(i.e., for academic or personal use), there *is* a full Verilog grammar
available (also available with commercial license, just not free).
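
If nesting does matter and a full grammar is more than you need, one possible middle ground is a small recursive grammar built with Forward. Note that this swaps in a different technique from the parse action mentioned above, and it is only a sketch: the sample text and the results names ("name", "ports", "body") are made up for illustration, and it still assumes the keywords never appear inside comments or strings.

```python
from pyparsing import (Forward, Group, Keyword, SkipTo, Suppress, Word,
                       ZeroOrMore, alphanums, alphas)

# Hypothetical nested input, made up for illustration.
NESTED_SAMPLE = """
module outer (a);
  wire w;
  module inner (b);
    reg r;
  endmodule
  assign w = a;
endmodule
"""

ident = Word(alphas + "_", alphanums + "_")
module_kw = Keyword("module")
end_kw = Keyword("endmodule")

port_list = Group(ident + ZeroOrMore(Suppress(",") + ident))
header = (Suppress(module_kw) + ident("name")
          + Suppress("(") + port_list("ports")
          + Suppress(")") + Suppress(";"))

# Plain text up to whichever keyword comes next, with outer whitespace trimmed.
text = SkipTo(module_kw | end_kw)
text.setParseAction(lambda t: t[0].strip())

# A module body is text interleaved with zero or more nested modules.
module = Forward()
body = Group(text + ZeroOrMore(module + text))
module <<= Group(header + body("body") + Suppress(end_kw))

outer = module.parseString(NESTED_SAMPLE)[0]
inner = outer["body"][1]  # the nested module sits between the two text runs
print(outer["name"], list(outer["ports"]))
print(inner["name"], list(inner["body"]))
```

Each module's body comes back as a list in which plain strings are text runs and nested groups are inner modules, so the original nesting is preserved in the results.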