re question - finding matching ()
Sun Jan 18 19:21:18 CET 2004
On 18 Jan 2004 07:51:38 -0800, Miki Tebeka wrote:
> Hello All,
> To all of you regexp gurus out there...
> I'd like to find all of the sub strings in the form "add(.*)"
> The catch is that I might have () in the string (e.g. "add((2 * 2),
> Currently I can only get the "add((2 * 2)" using
> re.compile("\w+\([^\)]*\)"). To solve the problem a hand crafted
> search is used :-(
> Is there a better way?
You would need "recursive patterns" to do this, but regular expressions
(at least as implemented in Python's re module) cannot handle them. You
can simulate a recursive pattern by limiting the nesting depth. For
example, the expression inside () should be [^\(\)]+ at level 0. At
level 1 it may also contain parenthesized groups, as long as nothing is
nested inside them: (?:\([^\(\)]+\)|[^\(\)]+)* and so on.
You can build such an expression recursively:
    import re
    def make_par_re(level=6):
        if level < 1:
            return r'[^\(\)]+'  # level 0: no parentheses at all
        return r'(?:\(%s\)|%s)*'%(make_par_re(level-1), make_par_re(0))
    par_re = re.compile(r"\w+\(%s\)"%make_par_re())
But in this case you are limited to 6 levels.
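To see where that limit bites, here is a small sketch. It assumes make_par_re looks as I described it (base case [^\(\)]+ and a default level of 6); nested_call is just a hypothetical helper that builds test strings, it is not part of the solution.

```python
import re

def make_par_re(level=6):
    # Assumed shape of the helper: level 0 allows no parentheses,
    # level n allows groups whose content is a level n-1 expression.
    if level < 1:
        return r'[^\(\)]+'
    return r'(?:\(%s\)|%s)*' % (make_par_re(level - 1), make_par_re(0))

par_re = re.compile(r"\w+\(%s\)" % make_par_re())

def nested_call(depth):
    # Hypothetical helper: "f(x)" for depth 1, "f((x))" for depth 2, ...
    return "f" + "(" * depth + "x" + ")" * depth

# Report which nesting depths the whole string still matches at.
for depth in (1, 7, 8):
    print(depth, bool(par_re.fullmatch(nested_call(depth))))
```

With these assumptions the call's own parentheses plus six nested pairs still match, and one more pair makes the match fail. Raising the default level moves the limit but never removes it.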
Now you can try this:
    text = ("add((2*2), 100) some text sub(a, b*(10-c), "
            "f(g(a,b), h(c, d)))")
    for m in par_re.findall(text):
        print(m)
I don't really like this solution because the expressions are ugly (try
Anyway, a better solution would be to use a real parser. You can
write your own by hand or make your choice here:
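If you go the hand-written route, a minimal sketch could look like the one below. The name find_calls is made up for illustration. It uses a regular expression only to find the "name(" part and then counts parentheses by hand, so the nesting depth is not limited. It does not know about string literals, so a ")" inside quotes would confuse it.

```python
import re

HEAD_RE = re.compile(r"\w+\(")

def find_calls(text):
    """Return every 'name(...)' substring whose parentheses balance."""
    calls = []
    pos = 0
    while True:
        m = HEAD_RE.search(text, pos)
        if m is None:
            break
        depth = 1          # we are just past the opening "("
        i = m.end()
        while i < len(text) and depth:
            if text[i] == "(":
                depth += 1
            elif text[i] == ")":
                depth -= 1
            i += 1
        if depth == 0:
            calls.append(text[m.start():i])
            pos = i        # continue after the complete call
        else:
            pos = m.end()  # unbalanced: skip this head, keep looking
    return calls

text = ("add((2*2), 100) some text sub(a, b*(10-c), "
        "f(g(a,b), h(c, d)))")
for call in find_calls(text):
    print(call)
```

Like findall, this reports only the outermost calls; f(...) and g(...) inside sub(...) are not listed separately.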
More information about the Python-list