need some regular expression help
Diez B. Roggisch
deets at nospam.web.de
Sun Oct 8 08:26:11 EDT 2006
Mirco Wahab schrieb:
> Thus spoke Diez B. Roggisch (on 2006-10-08 10:49):
>> Certainly true, and it always gives me a hard time because I don't know
>> to which extend a regular expression nowadays might do the job because
>> of these extensions. It was so much easier back in the old times....
>
> Right, in perl, this would be a no-brainer,
> its documented all over the place, like:
>
> my $re;
>
> $re = qr{
> (?:
> (?> [^\\()]+ | \\. )
> |
> \( (??{ $re }) \)
> )*
> }xs;
>
> where you have a 'delayed execution'
> of the
>
> (??{ $re })
>
> which in the end makes the whole a thing
> recursive one, it gets expanded and
> executed if the match finds its way
> to it.
>
> Above regex will match balanced parens,
> as in:
>
> my $good = 'a + (b / (c - 2)) * (d ^ (e+f)) ';
> my $bad1 = 'a + (b / (c - 2) * (d ^ (e+f)) ';
> my $bad2 = 'a + (b / (c - 2)) * (d) ^ (e+f) )';
>
> if you do:
>
> print "ok \n" if $good =~ /^$re$/;
> print "ok \n" if $bad1 =~ /^$re$/;
> print "ok \n" if $bad2 =~ /^$re$/;
>
>
> This in some depth documented e.g. in
> http://japhy.perlmonk.org/articles/tpj/2004-summer.html
> (topic: Recursive Regexes)
That clearly is a recursive grammar rule, and thus it can't be regular
anymore :) But first of all, I find it ugly - the clean separation of
lexical and syntactical analysis is better here, IMHO - and secondly,
what are the properties of that parsing? Is it LL(k), LR(k), backtracking?
Diez
More information about the Python-list
mailing list