Reg Exp: Need advice concerning "greediness"

Franz GEIGER fgeiger at datec.at
Wed Oct 4 09:57:30 EDT 2000


> I used a negated character class to force an end for the first group before
> a possible COLOR tag.  Otherwise, what I think is happening is that your

That did the trick.

> is included into it. BTW, I changed that '*' to '?', which is what you meant,
> if I read correctly.

Yes.

As fascinating as regular expressions are, they are not always easy to
understand and use, especially for newbies.

Thanks a lot and
best regards

Franz

Calvelo Daniel <dcalvelo at pharion.univ-lille2.fr> wrote in message
8r9nc3$8re$1 at netserv.univ-lille1.fr...
> Franz GEIGER <fgeiger at datec.at> wrote:
> : Hello all,
>
> : I want to exchange font colors of headings of a certain level in HTML
> : files.
>
> : I have a line containing a heading level 1, e.g.: <h1><font
> : COLOR="#FF0000">Heading Level 1</font></h1>.
>
> : Now I want to split this into 3 groups: Everything before "COLOR=xyz",
> : "COLOR=xyz" itself, and everything after "COLOR=xyz".
>
> : I tried:
> : sRslt = "<h1><font COLOR="#FF0000">Heading Level 1</font></h1>";
> : print re.findall(re.compile(r'(.*?FONT.*?)(COLOR=.*?)*([ |>].*)', re.I |
> : re.S), sRslt);
>
> Beware of quotes in your example:
>
> >>> sRslt = "<h1><font COLOR="#FF0000">Heading Level 1</font></h1>"
> >>> sRslt
> '<h1><font COLOR='
>
> (That explains weird results reported here)
>
> As for your regexp, the following works:
>
> >>> print re.findall(re.compile(r'(.*?FONT[^">]+?)(COLOR=.*?)?([ |>].*)',
> ...     re.I | re.S), sRslt);
> [('<h1><font ', 'COLOR="#FF0000"', '>Heading Level 1</font></h1>')]
>
> I used a negated character class to force an end for the first group before
> a possible COLOR tag.  Otherwise, what I think is happening is that your
> non-greedy search is indeed non-greedy, but the null-match of '(COLOR=.*?)*'
> is included into it. BTW, I changed that '*' to '?', which is what you meant,
> if I read correctly.
>
> HTH, DCA
>
> -- Daniel Calvelo Aros
>      calvelo at lifl.fr
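
The difference between the two patterns discussed above can be checked with a
short script. This is only a sketch, assuming Python 3 (the thread itself uses
the Python 2 print statement). The names html, ORIGINAL, FIXED and recolored,
and the replacement color #0000FF, are illustrative and do not come from the
thread.

```python
import re

# Single quotes outside, so the double quotes around the color value
# stay part of the string instead of ending it early.
html = '<h1><font COLOR="#FF0000">Heading Level 1</font></h1>'

FLAGS = re.I | re.S

# Pattern from the question. Group 1 stops right after "font", the
# starred COLOR group matches zero times, and [ |>] takes the space,
# so the whole attribute lands in the last group.
ORIGINAL = re.compile(r'(.*?FONT.*?)(COLOR=.*?)*([ |>].*)', FLAGS)

# Pattern from the reply. [^">]+? must consume at least one character
# after "font" (here the space), which lets the COLOR group match.
FIXED = re.compile(r'(.*?FONT[^">]+?)(COLOR=.*?)?([ |>].*)', FLAGS)

original_groups = ORIGINAL.findall(html)
fixed_groups = FIXED.findall(html)

print(original_groups)
# [('<h1><font', '', ' COLOR="#FF0000">Heading Level 1</font></h1>')]
print(fixed_groups)
# [('<h1><font ', 'COLOR="#FF0000"', '>Heading Level 1</font></h1>')]

# Hypothetical follow-up: keep groups 1 and 3, swap in a new color.
recolored = FIXED.sub(
    lambda m: m.group(1) + 'COLOR="#0000FF"' + m.group(3), html)
print(recolored)
# <h1><font COLOR="#0000FF">Heading Level 1</font></h1>
```

With the original pattern the middle group comes back empty, which matches the
explanation in the reply: the lazy first group is satisfied as soon as the
optional COLOR group is allowed to match nothing.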






