help with pyparsing

Prabhu Gurumurthy pgurumur at gmail.com
Mon Dec 10 11:04:32 EST 2007


Paul McGuire wrote:
> On Dec 9, 11:01 pm, Prabhu Gurumurthy <pguru... at gmail.com> wrote:
>>
>> All,
>>
>> I have the following lines that I would like to parse in python using
>> pyparsing, but have some problems forming the grammar.
>>
>> Line in file:
>> table <ALINK> const { 207.135.103.128/26, 207.135.112.64/29 }
>> table <INTRANET> persist { ! 10.200.2/24, 10.200/22 }
>> table <RFC_1918> const { 192.168/16, ! 172.24.1/29, 172.16/12, 169.254/16 }
>> table <DIALER> persist { 10.202/22 }
>> table <RAVPN> const { 10.206/22 }
>> table <KS> const {   \
>>    10.205.1/24,      \
>>    169.136.241.68,   \
>>    169.136.241.70,   \
>>    169.136.241.71,   \
>>    169.136.241.72,   \
>>    169.136.241.75,   \
>>    169.136.241.76,   \
>>    169.136.241.77,   \
>>    169.136.241.78,   \
>>    169.136.241.79,   \
>>    169.136.241.81,   \
>>    169.136.241.82,   \
>>    169.136.241.85 }
>>
>> I have the following grammar definition:
>>
>> tableName = Word(alphanums + "-" + "_")
>> leftClose = Suppress("<")
>> rightClose = Suppress(">")
>> key = Suppress("table")
>> tableType = Regex("persist|const")
>> ip4Address = OneOrMore(Word(nums + "."))
>> ip4Network = Group(ip4Address + Optional(Word("/") +
>> OneOrMore(Word(nums))))
>> temp = ZeroOrMore("\\" + "\n")
>> tableList = OneOrMore(Optional("\\") |
>>                ip4Network | ip4Address | Suppress(",") | Literal("!"))
>> leftParen = Suppress("{")
>> rightParen = Suppress("}")
>>
>> table = key + leftClose + tableName + rightClose + tableType + \
>>                   leftParen + tableList + rightParen
>>
>> I cannot seem to match the sixth table in the file above, i.e. the one
>> named KS. How do I form the grammar for it? Also, I still cannot ignore
>> comments using table.ignore(Literal("#") + restOfLine); I get a parse error.
>>
>> Any help appreciated.
>> Thanks
>> Prabhu
> 
> Prabhu -
> 
> This is a good start, but here are some suggestions:
> 
> 1. ip4Address = OneOrMore(Word(nums + "."))
> 
> Word(nums+".") will read any contiguous set of characters in the
> string nums+".", so OneOrMore is not necessary for reading in an
> ip4Address.  Just use:
> 
> ip4Address = Word(nums + ".")
> 
> 
> 2. ip4Network = Group(ip4Address + Optional(Word("/") +
> OneOrMore(Word(nums))))
> 
> The same comment applies here: OneOrMore is not needed around the
> Word(nums) that follows the "/" after the ip4Address:
> 
> ip4Network = Group(ip4Address + Optional(Word("/") + Word(nums)))
> 
> 
> 3. tableList = OneOrMore(Optional("\\") |
>                ip4Network | ip4Address | Suppress(",") | Literal("!"))
> 
> The list of ip4Networks is just a comma-delimited list, with some
> entries preceded with a '!' character.  It is simpler to use
> pyparsing's built-in helper, delimitedList, as in:
> 
> tableList = Group( delimitedList(Group("!"+ip4Network)|ip4Network) )
> 
> 
> Yes, I know, you are saying, "but what about all those backslashes?"
> The backslashes look like they are just there as line continuations.
> We can define an ignore expression, so that the table expression, and
> all of its contained expressions, will ignore '\' characters as line
> continuations:
> 
> table.ignore( Literal("\\") + LineEnd() )
> 
> And I'm not sure why you had trouble with ignoring '#' + restOfLine,
> it works fine in the program below.
> 
> If you make these changes, your program will look something like this:
> 
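> from pyparsing import (Group, LineEnd, Literal, OneOrMore, Optional, Regex,
>                        Suppress, Word, alphanums, delimitedList, nums,
>                        restOfLine)
> 
> tableName = Word(alphanums + "-" + "_")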
> leftClose = Suppress("<")
> rightClose = Suppress(">")
> key = Suppress("table")
> tableType = Regex("persist|const")
> ip4Address = Word(nums + ".")
> ip4Network = Group(ip4Address + Optional(Word("/") + Word(nums)))
> tableList = Group(delimitedList(Group("!"+ip4Network)|ip4Network))
> leftParen = Suppress("{")
> rightParen = Suppress("}")
> 
> table = key + leftClose + tableName + rightClose + tableType + \
>                   leftParen + tableList + rightParen
> table.ignore(Literal("\\") + LineEnd())
> table.ignore(Literal("#") + restOfLine)
> 
> # parse the input text (held in the string `line`), and pprint the results
> from pprint import pprint
> result = OneOrMore(table).parseString(line)
> pprint(result.asList())
> 
> Prints out:
> ['ALINK',
>  'const',
>  [['207.135.103.128', '/', '26'], ['207.135.112.64', '/', '29']],
>  'INTRANET',
>  'persist',
>  [['!', ['10.200.2', '/', '24']], ['10.200', '/', '22']],
>  'RFC_1918',
>  'const',
>  [['192.168', '/', '16'],
>   ['!', ['172.24.1', '/', '29']],
>   ['172.16', '/', '12'],
>   ['169.254', '/', '16']],
>  'DIALER',
>  'persist',
>  [['10.202', '/', '22']],
>  'RAVPN',
>  'const',
>  [['10.206', '/', '22']],
>  'KS',
>  'const',
>  [['10.205.1', '/', '24'],
>   ['169.136.241.68'],
>   ['169.136.241.70'],
>   ['169.136.241.71'],
>   ['169.136.241.72'],
>   ['169.136.241.75'],
>   ['169.136.241.76'],
>   ['169.136.241.77'],
>   ['169.136.241.78'],
>   ['169.136.241.79'],
>   ['169.136.241.81'],
>   ['169.136.241.82'],
>   ['169.136.241.85']]]
> 
> -- Paul

Awesome, thanks a lot. I will try it today and let you know how it
goes.

Thanks again.
Prabhu
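
For reference, here is a minimal self-contained sketch of the revised
grammar quoted above, with imports and inline sample input so it can be
run as-is. Some assumptions: the sample is abbreviated from the tables in
the original post, the leading "#" comment line was added only to exercise
the comment-ignore rule, and the names SAMPLE and entry are introduced
here for illustration. It uses the camelCase pyparsing names that appear
in this thread; newer pyparsing releases also provide snake_case
equivalents.

```python
from pprint import pprint

from pyparsing import (
    Group, LineEnd, Literal, OneOrMore, Optional, Regex, Suppress, Word,
    alphanums, delimitedList, nums, restOfLine,
)

# Raw string, so each trailing backslash stays a literal line continuation.
# The "#" line is an added assumption to show that comments are skipped.
SAMPLE = r"""
# comment lines like this one are skipped
table <INTRANET> persist { ! 10.200.2/24, 10.200/22 }
table <KS> const {   \
   10.205.1/24,      \
   169.136.241.68,   \
   169.136.241.85 }
"""

tableName = Word(alphanums + "-_")
tableType = Regex("persist|const")

# A single Word already consumes a whole run of digits and dots.
ip4Address = Word(nums + ".")
ip4Network = Group(ip4Address + Optional(Literal("/") + Word(nums)))

# An entry is a network, optionally negated with a leading "!".
entry = Group("!" + ip4Network) | ip4Network
tableList = Group(delimitedList(entry))

table = (Suppress("table") + Suppress("<") + tableName + Suppress(">")
         + tableType + Suppress("{") + tableList + Suppress("}"))

# Ignore rules are attached after the grammar is built, so they propagate
# to every sub-expression of table.
table.ignore(Literal("\\") + LineEnd())   # backslash line continuations
table.ignore(Literal("#") + restOfLine)   # "#" comments

result = OneOrMore(table).parseString(SAMPLE).asList()
pprint(result)
```

Each table contributes three items to the flat result list: its name, its
type, and a nested list of entries. Negated entries appear as
['!', [address, '/', prefix]], and bare addresses as single-item lists.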


