pyparsing and recursion problem

pyscottishguy at pyscottishguy at
Thu Jul 26 21:50:08 CEST 2007


Using pyparsing, I'm trying to parse a string like this:

:Start: first SECOND THIRD :SECOND: second1 | second2 :THIRD: third1 |
FOURTH  :FOURTH: fourth1 | fourth2

I want the parser to do the following:
1) Get the text for the :Start: label, e.g. ('first SECOND THIRD')
2) Leave the lower-case words unchanged, e.g. ('first')
3) For each upper-case word, find the corresponding entries and
    replace the word with these entries (the '|' separates records),
    e.g. for 'SECOND', replace the word with ("second1", "second2")
4) Do this recursively, because each entry found in step 3 can itself
    contain upper-case words
I can do this, but not within pyparsing itself: I had to write a
recursive function that post-processes the parse results.  I would like
to do the expansion within pyparsing instead.

I'm pretty sure I have to use the Forward class along with a few
setResultsName() calls, but after reading the documentation and many
examples, and trying for hours, I'm still totally lost.  Please help!

Here is the program I have so far:

from pyparsing import (Word, Optional, OneOrMore, Group, alphas,
                       alphanums, Suppress, Dict)

import string

def allIn(chars, members):
    """Tests that all elements of chars are in members"""
    for c in chars:
        if c not in members:
            return False
    return True

def allUpper(word):
    """Tests that all characters in word are uppercase ASCII letters"""
    return allIn(word, string.ascii_uppercase)

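def getItems(myArray, myDict):
    """Recursively get the items for each CAPITAL word"""
    myElements = []
    for element in myArray:
        myWords = []
        for word in element:
            if allUpper(word):
                # Upper-case word: replace it with its expanded records
                items = getItems(myDict[word], myDict)
                myWords.append(items)
            else:
                # Lower-case word: keep it unchanged
                myWords.append(word)
        myElements.append(myWords)
    return myElements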

testData = """
:Start: first SECOND THIRD  fourth FIFTH

:SECOND: second1_1 second1_2 | second2 | second3

:THIRD: third1 third2 | SIXTH

:FIFTH: fifth1 | SEVENTH

:SIXTH: sixth1_1 sixth1_2 | sixth2

:SEVENTH: EIGHTH | seventh1

:EIGHTH: eighth1 | eighth2
"""

label = Suppress(":") + Word(alphas + "_") + Suppress(":")

words = Group(OneOrMore(Word(alphanums + "_"))) + Suppress(Optional("|"))

data = ~label + OneOrMore(words)

line = Group(label + data)

doc = Dict(OneOrMore(line))

res = doc.parseString(testData)

# This prints out what pyparsing gives us
for entry in res:
    print(entry)


startString = res["Start"]
items = getItems([startString], res)[0]
# This prints out what we want
for item in items:
    print(item)

More information about the Python-list mailing list