Python-based regular expression parser that allows patterns to call functions?

Paul McGuire ptmcg at austin.rr.com
Sun Mar 2 17:53:47 CET 2008


On Mar 2, 8:41 am, Andrew Warkentin <andr... at datanet.ab.ca> wrote:
> I am writing a filtering HTTP proxy (the site is http://xuproxy.sourceforge.net/). I want it to be compatible with
> Proxomitron (http://proxomitron.info/) filters. I need a regular
> expression parser that allows patterns to call functions (or more
> likely, class methods), to implement "matching commands" (look at the
> Proxomitron documentation to see what I mean). Does anyone know if such a
> library exists for Python, or do I have to write my own parser?

Andrew -

Pyparsing allows you to define parse actions that get called when
elements within a grammar are matched.  These actions can update
external data structures, modify the matched text, or can be used to
provide additional semantic validation.  Here's an example:

from pyparsing import *

integer = Regex(r"\b\d+\b")
# could also be written as
#~ integer = WordStart() + Word(nums) + WordEnd()

# convert matched text to actual integer
def cvt_to_int(tokens):
    return int(tokens[0])

# only accept integers < 100
def must_be_less_than_100(tokens):
    if tokens[0] >= 100:
        raise ParseException("only integers < 100 are allowed")

# add value to running tally of matches
def increment_tally(tokens):
    global running_total
    running_total += tokens[0]

integer.setParseAction(cvt_to_int)
integer.addParseAction(must_be_less_than_100)
integer.addParseAction(increment_tally)

# could also be written as
#~ integer.setParseAction( cvt_to_int,
    #~ must_be_less_than_100,
    #~ increment_tally )

running_total = 0
print(integer.searchString("absdlkj 1 5 12 121 78 22"))
print(running_total)

Prints:

[[1], [5], [12], [78], [22]]
118
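
The example above covers conversion, validation, and updating external
state, but not modifying the matched text. Here is a minimal sketch of
that case using transformString, which rewrites the input by
substituting whatever the parse action returns. The double_it function
and the sample input are made up for illustration, and a real
Proxomitron-style matching command would need its own grammar:

```python
from pyparsing import Word, nums

# Hypothetical parse action: replace each matched integer with its double.
# Returning a string means the replacement text is used as-is.
def double_it(tokens):
    return str(int(tokens[0]) * 2)

integer = Word(nums)
integer.setParseAction(double_it)

# transformString scans the input, leaves unmatched text alone, and
# replaces each match with the value returned by the parse action.
result = integer.transformString("a 1 b 22 c 300")
print(result)
```

Any callable can be attached this way, including a bound method of a
class, so the action can consult or update whatever state a filter
needs.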

More info about pyparsing is at http://pyparsing.wikispaces.com, along
with more examples and links to other documentation.

-- Paul


