# [Tutor] Help with re module maybe?

Kent Johnson kent37 at tds.net
Sat Nov 20 04:39:29 CET 2004

```In general it is hard to use regular expressions to parse when you have
to keep some state to know what to do. In this case you have to keep
track of the nesting of the parens in order to know how to handle a plus
sign.
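
# A quick illustration, not from the original exchange (the names expr and
# naive_parts are made up for this sketch): a plain str.split has no notion
# of nesting, so it also cuts at the plus sign inside the parentheses.
expr = "x**2+sin(x**2+2*x)"
naive_parts = expr.split("+")
print(naive_parts)  # ['x**2', 'sin(x**2', '2*x)']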

This problem is simple enough that a small state machine that counts
open and close parentheses does the job. For more complicated state, a
parsing library like pyparsing is very helpful.

I encourage you to learn about the re module, though. It is often handy.
http://www.amk.ca/python/howto/regex/
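
# A small taste of re, as a hedged sketch (TOP_LEVEL_PLUS and regex_split are
# hypothetical names, not taken from the howto). The pattern matches a plus
# sign unless the text after it reaches a ')' before any other parenthesis,
# which is a rough stand-in for "this plus is inside parentheses". That
# assumption only holds for one level of nesting; with deeper nesting the
# regex has no way to count, and it splits in the wrong place.
import re

TOP_LEVEL_PLUS = re.compile(r'\+(?![^()]*\))')

def regex_split(s):
    # Split s at every plus sign the lookahead does not rule out
    return TOP_LEVEL_PLUS.split(s)

print(regex_split("x**2+sin(x**2+2*x)"))  # ['x**2', 'sin(x**2+2*x)']
print(regex_split("((a+b)+(c+d))+e"))     # ['((a+b)', '(c+d))', 'e'], wrong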

Kent

Some people, when confronted with a problem, think “I know, I’ll use
regular expressions.” Now they have two problems.
--Jamie Zawinski, in comp.emacs.xemacs

def splitter(s):
    ''' Split a string on each plus sign that is not inside parentheses '''

    parenCount = 0  # Counts current nesting
    start = 0       # Start of current run

    for i, c in enumerate(s):
        if c == '+' and parenCount == 0:
            yield s[start:i]
            start = i+1
        elif c == '(':
            parenCount += 1
        elif c == ')':
            parenCount -= 1

    # Yield any leftovers
    if start < len(s):
        yield s[start:]

test = [
"x**2+sin(x**2+2*x)",
'abcd',
'(a+b)+(c+d)',
'((a+b)+(c+d))+e'
]

for s in test:
    print(s, list(splitter(s)))

prints:
x**2+sin(x**2+2*x) ['x**2', 'sin(x**2+2*x)']
abcd ['abcd']
(a+b)+(c+d) ['(a+b)', '(c+d)']
((a+b)+(c+d))+e ['((a+b)+(c+d))', 'e']

Jacob S. wrote:
> Okay,
>
>     say I have a string "x**2+sin(x**2+2*x)" and I want to split it at the
> addition sign. However, I don't want to split it at the addition signs
> inside the parentheses. How do I go about doing this?
>
> Goes something along the lines
>
> a = a.split("+") # if and only if + not inside parenthesis
>
> That should be enough information to help...
> I think the re module might help. Any insights as to a good walkthrough of
> the re module would be helpful. If you have any suggestions, or would like
> to give me a more detailed way to do this (maybe a specific place in the re
> module?)
>