[Tutor] regex help
Paul McGuire
ptmcg at austin.rr.com
Mon Feb 23 12:07:18 CET 2009
I second Alan G's appreciation for a well-thought-through and well-conveyed
description of your text processing task. (Is "Alan G" his gangsta name, I
wonder?)
This pyparsing snippet may point you to some easier-to-follow code,
especially once you go beyond the immediate task and do more exhaustive
parsing of your syllable syntax.
from pyparsing import *
LT,GT = map(Suppress,"<>")
lower = oneOf(list(alphas.lower()))
H = Suppress("H")
# have to look ahead to only accept lowers if NOT followed by H
patt = LT + H + ZeroOrMore(lower + ~H)("body") + lower + H + GT
tests = """\
a b c<H d e f gH> h<H i j kH>
a b c<H dH>
a b c<H d eH>""".splitlines()
for t in tests:
    print(t)
    # flatten the "body" lists from every match on this line into one list
    print(sum((list(p.body)
               for p in patt.searchString(t) if p.body), []))
    print("")
Prints:
a b c<H d e f gH> h<H i j kH>
['d', 'e', 'f', 'i', 'j']
a b c<H dH>
[]
a b c<H d eH>
['d']
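For comparison, here is a rough sketch of the same extraction using only the standard re module. It assumes the syllables are single lowercase letters, as in the test strings above; the names GROUP and body_letters are made up for this illustration and are not part of pyparsing or of the original task.

```python
import re

# One "<H ... H>" group; group(1) holds the lowercase letters (and any
# whitespace) between the opening "<H" and the closing "H>".
GROUP = re.compile(r"<H\s*((?:[a-z]\s*)+)H>")

def body_letters(line):
    """Collect every letter in each <H ... H> group except the group's last one."""
    result = []
    for m in GROUP.finditer(line):
        letters = re.findall(r"[a-z]", m.group(1))
        result.extend(letters[:-1])  # drop the letter that sits right before "H>"
    return result

tests = """\
a b c<H d e f gH> h<H i j kH>
a b c<H dH>
a b c<H d eH>""".splitlines()

for t in tests:
    print(t)
    print(body_letters(t))
```

The slice letters[:-1] does the job that the ~H lookahead does in the pyparsing version: the last letter before the closing H is left out of the result.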
There is more info on pyparsing at http://pyparsing.wikispaces.com.
-- Paul