regexp non-greedy matching bug?
John Hazen
john at hazen.net
Sun Dec 4 02:31:43 EST 2005
> [John Hazen]
> > I want to match one or two instances of a pattern in a string.
> >
> > >>> s = 'foobarbazfoobar'
> > >>> foofoo = re.compile(r'^(foo)(.*?)(foo)?(.*?)$')
> > >>> foofoo.match(s).group(1)
> > 'foo'
> > >>> foofoo.match(s).group(3)
> > >>>
[Tim Peters]
> Your problem isn't that
>
> (foo)?
>
> is not greedy (it is greedy), it's that your first
>
> (.*?)
>
> is not greedy. Remember that regexps also work left to right.
Well, I had the same symptoms when that .* was greedy (it ate up the
optional foo), which is why I went to non-greedy there.
I guess my error was thinking that greedy trumped non-greedy, rather
than left trumping right. (i.e., in order for the (foo)? to be maximally
greedy, the (.*?) has to be non-maximally non-greedy :)
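For the record, here is a small sketch of both attempts side by side,
using the same string as above. The names lazy and greedy are just
labels for this illustration. In both cases the optional (foo)? ends up
matching nothing, because whatever sits to its left has already decided
how much to take by the time it is tried:

```python
import re

s = 'foobarbazfoobar'

# Lazy first wildcard: it starts by matching the empty string, so (foo)?
# is tried at index 3, where the text is 'bar'. Matching nothing there
# counts as success, so the engine never backtracks to retry it later.
lazy = re.match(r'^(foo)(.*?)(foo)?(.*?)$', s)
print(lazy.groups())    # ('foo', '', None, 'barbazfoobar')

# Greedy first wildcard: it swallows the rest of the string, and (foo)?
# is again satisfied by matching nothing at the very end.
greedy = re.match(r'^(foo)(.*)(foo)?(.*?)$', s)
print(greedy.groups())  # ('foo', 'barbazfoobar', None, '')
```

Either way group(3) is None; only the place where the rest of the
string ends up changes.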
> Maybe what you're looking for is
>
> ^P(.*P)?.*$
Yes. That works the way I wanted. ( ^(foo)(.*(foo))?.*$ )
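A quick check of that pattern, as a sketch: the test strings and the
names two and one are only for illustration. Note that the second foo
is still group 3, since group 2 is now the whole optional part:

```python
import re

foofoo = re.compile(r'^(foo)(.*(foo))?.*$')

# Two instances: the optional group needs a second 'foo' to succeed, so
# the engine backtracks inside it until one is found.
two = foofoo.match('foobarbazfoobar')
print(two.group(3), two.start(3))  # foo 9

# One instance: the optional group is skipped and the match still succeeds.
one = foofoo.match('foobarbaz')
print(one.group(3))                # None
```

Because the .* inside the optional group is greedy, a string with three
or more instances would put the last one in group 3.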
Thank you, both for the specific answer, and the general education.
-John