How can we get to the end of a quote inside a string

Paul McGuire ptmcg at austin.rr.com
Tue Sep 2 10:51:53 EDT 2008


On Aug 31, 9:29 am, rajmoha... at gmail.com wrote:
> Hi all,
>     Suppose I have a string which contains quotes inside quotes -
> single and double quotes interchangeably -
>  s = "a1' b1 " c1' d1 ' c2" b2 'a2"
>      I need to start at b1 and end at b2 - i.e. I have to parse the
> single quote strings from inside s.
>

Pyparsing defines a helper function called nestedExpr. It is typically
used to find nesting of ()'s or []'s, but I was interested to see
whether I could use nestedExpr to match nested ()'s, []'s, AND {}'s all
in the same string (as we used to do in algebra class to show nesting
deeper than parens, something like "{[a + 3*(b-c)] + 7}", where ()'s
nest within []'s, and []'s nest within {}'s). This IS possible, but it
uses some advanced pyparsing methods. I adapted that example to your
case, which was much simpler, since ""s nest within ''s and ''s nest
within ""s. I still keep a stack of previous nesting, but I'm not sure
this was absolutely necessary. Here is the working code with your
example:

from pyparsing import (Forward, oneOf, NoMatch, Literal, CharsNotIn,
                       nestedExpr)

# define special subclass of Forward, that saves previous contained
# expressions in a stack
class ForwardStack(Forward):
    def __init__(self):
        super(ForwardStack,self).__init__()
        self.exprStack = []
        self << NoMatch()
    def __lshift__(self,expr):
        self.exprStack.append(self.expr)
        super(ForwardStack,self).__lshift__(expr)
        return self
    def pop(self):
        self.expr = self.exprStack.pop()

# define the grammar
opening = ForwardStack()
closing = ForwardStack()
opening << oneOf(["'", '"'])
closing << NoMatch()
matchedNesting = nestedExpr(opening, closing, CharsNotIn('\'"'),
                            ignoreExpr=None)

# define parse-time callbacks
alternate = {'"':"'", "'":'"'}
def pushAlternate(t):
    # closing expression should match the current opening quote char
    closing << Literal( t[0] )
    # if we find the other opening quote char, it is the beginning of
    # a nested quote
    opening << Literal( alternate[ t[0] ] )
def popClosing():
    closing.pop()
    opening.pop()
# when these expressions match, the parse action will be called
opening.setParseAction(pushAlternate)
closing.setParseAction(popClosing)

# parse the test string
s = """ "a1' b1 " c1' d1 ' c2" b2 'a2" """

print(matchedNesting.parseString(s)[0])


Prints:

['a1', [' b1 ', [' c1', [' d1 '], ' c2'], ' b2 '], 'a2']
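
For comparison, the same nesting can be worked out without pyparsing.
What follows is only a minimal sketch, not the pyparsing approach above:
it assumes the two quote characters strictly alternate, so a quote that
matches the innermost open one closes that level and the other quote
opens a new one. The function name parse_alternating_quotes is made up
for this illustration.

```python
def parse_alternating_quotes(text):
    """Return nested lists for quotes that alternate between ' and "."""
    root = []
    stack = [root]     # stack[-1] is the list currently being filled
    open_quotes = []   # quote chars still waiting for their closer
    buf = []           # characters of the current run of plain text

    def flush():
        # move any pending plain text into the current nesting level
        if buf:
            stack[-1].append("".join(buf))
            del buf[:]

    for ch in text:
        if ch not in "'\"":
            buf.append(ch)
        elif open_quotes and ch == open_quotes[-1]:
            # same char as the innermost open quote: close that level
            flush()
            open_quotes.pop()
            stack.pop()
        else:
            # first quote, or the other quote char: open a nested level
            flush()
            child = []
            stack[-1].append(child)
            stack.append(child)
            open_quotes.append(ch)

    flush()
    if open_quotes:
        raise ValueError("unbalanced quotes")
    return root


s = """"a1' b1 " c1' d1 ' c2" b2 'a2\""""
result = parse_alternating_quotes(s)
print(result[0])
```

Here result[0] is the nested list for the outermost double-quoted
string, and result[0][1] is the sublist for the single-quoted span that
runs from b1 to b2.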


-- Paul




