[Tutor] Help with re module maybe?

Jacob S. keridee at jayco.net
Mon Nov 22 01:00:27 CET 2004


Thank you Kent!

Your function is nifty. I haven't fit it into my situation yet, but I like it
already!
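
Here is a rough sketch of how I expect to drop it in where my original
a.split("+") line was. The splitter function is Kent's, copied as posted;
wrapping it in list() and the sample string are only my assumptions about how
I'll use it, not something I've tried in my real program yet.

```python
def splitter(s):
    ''' Split a string on each plus sign that is not inside parentheses '''

    parenCount = 0  # Counts current nesting
    start = 0       # Start of current run

    for i, c in enumerate(s):
        if c == '+' and parenCount == 0:
            # Top-level plus sign: emit the run that ends here
            yield s[start:i]
            start = i+1
        elif c == '(':
            parenCount += 1
        elif c == ')':
            parenCount -= 1

    # Yield any leftovers
    if start < len(s):
        yield s[start:]

a = "x**2+sin(x**2+2*x)"
a = list(splitter(a))  # stands in for a = a.split("+")
print(a)
```

Since splitter is a generator, list() is only needed when the pieces have to be
indexed or reused; a plain for loop over splitter(a) should work as well.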

Jacob Schmidt

Kent Johnson said one fine day...
> In general it is hard to use regular expressions to parse when you have
> to keep some state to know what to do. In this case you have to keep
> track of the nesting of the parens in order to know how to handle a plus
> sign.
>
> This is a simple enough problem that a simple state machine that counts
> open and close parentheses does the job. For more complicated state a
> parsing library like pyparsing is very helpful.
>
> I encourage you to learn about the re module, though. It is often handy.
> This HOW-TO might help you get started:
> http://www.amk.ca/python/howto/regex/
>
> Kent
>
> Some people, when confronted with a problem, think “I know, I’ll use
> regular expressions.” Now they have two problems.
> --Jamie Zawinski, in comp.emacs.xemacs
>
>
> def splitter(s):
>      ''' Split a string on each plus sign that is not inside parentheses '''
>
>      parenCount = 0  # Counts current nesting
>      start = 0       # Start of current run
>
>      for i, c in enumerate(s):
>          if c == '+' and parenCount == 0:
>              yield s[start:i]
>              start = i+1
>          elif c == '(':
>              parenCount += 1
>          elif c == ')':
>              parenCount -= 1
>
>      # Yield any leftovers
>      if start < len(s):
>          yield s[start:]
>
> test = [
>      "x**2+sin(x**2+2*x)",
>      'abcd',
>      '(a+b)+(c+d)',
>      '((a+b)+(c+d))+e'
> ]
>
> for s in test:
>      print s, list(splitter(s))
>
>
> prints:
> x**2+sin(x**2+2*x) ['x**2', 'sin(x**2+2*x)']
> abcd ['abcd']
> (a+b)+(c+d) ['(a+b)', '(c+d)']
> ((a+b)+(c+d))+e ['((a+b)+(c+d))', 'e']
>
>
>
> Jacob S. wrote:
> > Okay,
> >
> >     say I have a string "x**2+sin(x**2+2*x)" and I want to split it at the
> > addition sign. However, I don't want to split it at the addition signs
> > inside the parentheses. How do I go about doing this?
> >
> > It goes something along these lines:
> >
> > a = a.split("+") # if and only if + not inside parenthesis
> >
> > That should be enough information to help...
> > I think the re module might help. Any pointers to a good walkthrough of
> > the re module would be helpful. If you have any suggestions, or would like
> > to give me a more detailed way to do this (maybe a specific place in the
> > re module?), please do.
> >
> > Thanks in advance,
> > Jacob Schmidt
> >
> > _______________________________________________
> > Tutor maillist  -  Tutor at python.org
> > http://mail.python.org/mailman/listinfo/tutor
> >


