[Tutor] stopping greedy matches
cyresse at gmail.com
Wed Mar 16 23:36:05 CET 2005
> >>> x=re.compile(r"(?<=\bin).+\b")
>>> x = re.compile(r"in (.+?)\b")
.+? is a non-greedy matcher, I believe. Note the r prefix: without it, "\b" is a backspace character rather than a word boundary.
Are you using python24/tools/scripts/redemo.py? Use that to test regexes.
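
A quick check of how the r prefix and the choice of quantifier change what the group captures. This is only a sketch, written for Python 3 (print as a function); the variable names are mine, not from the original post.

```python
import re

s = "only the word in CAPS should be matched"

# Without the r prefix, "\b" is a backspace character, so nothing matches.
no_raw = re.search("in (.*?)\b", s)

# Raw string, but .*? may match zero characters, and there is already a
# word boundary right before "CAPS", so the group comes back empty.
lazy_star = re.search(r"in (.*?)\b", s)

# .+? has to take at least one character, so it runs to the end of the word.
lazy_plus = re.search(r"in (.+?)\b", s)

print(no_raw)                      # None
print(repr(lazy_star.group(1)))    # ''
print(lazy_plus.group(1))          # CAPS
```

So the non-greedy form only helps here when the quantifier is forced to consume something before the boundary test.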
On Wed, 16 Mar 2005 12:12:32 -0800, Mike Hall
<michael.hall at critterpixstudios.com> wrote:
> I'm having trouble getting re to stop matching after it's consumed what
> I want it to. Using this string as an example, the goal is to match "CAPS":
> >>> s = "only the word in CAPS should be matched"
> So let's say I want to specify when to begin my pattern by using a lookbehind:
> >>> x = re.compile(r"(?<=\bin)") #this will simply match the spot in
> front of "in"
> So that's straightforward, but let's say I don't want to use a
> lookahead to specify the end of my pattern, I simply want it to stop
> after it has combed over the word following "in". I would expect this
> to work, but it doesn't:
> >>> x=re.compile(r"(?<=\bin).+\b") #this will consume everything past
> "in" all the way to the end of the string
> In the above example I would think that the word boundary flag "\b"
> would indicate a stopping point. Is ".+\b" not saying, "keep matching
> characters until a word boundary has been reached"?
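
On the ".+\b" question: as far as I can tell, + is greedy, so ".+" first takes everything up to the end of the string and only gives characters back if "\b" fails there. The end of the string right after "matched" is itself a word boundary, so nothing is given back. A small sketch (Python 3, variable names are mine) that also shows why simply adding ? to the lookbehind version is not enough:

```python
import re

s = "only the word in CAPS should be matched"

# Greedy: .+ runs to the end of the string, where \b already succeeds.
greedy = re.search(r"(?<=\bin).+\b", s)
print(repr(greedy.group()))   # ' CAPS should be matched'

# Lazy: the match starts on the space after "in", and the first word
# boundary is right before "CAPS", so only that one space is matched.
lazy = re.search(r"(?<=\bin).+?\b", s)
print(repr(lazy.group()))     # ' '
```

In other words, "\b" marks a place where the match is allowed to stop, not a place where it has to stop.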
> Even stranger are the results I get from:
> >>> x=re.compile(r"(?<=\bin).+\s") #keep matching characters until a
> whitespace has been reached(?)
> >>> r = x.sub("!@!", s)
> >>> print r
> only the word in!@!matched
> For some reason there it's decided to consume three words instead of one.
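
The same greediness seems to explain this result: ".+" grabs the rest of the string and then backs up only as far as the last whitespace character, which is the space before "matched". A sketch for comparison (Python 3; the lazy variant is my own addition, not something from the original post):

```python
import re

s = "only the word in CAPS should be matched"

# Greedy: backs up from the end of the string to the LAST whitespace.
greedy = re.compile(r"(?<=\bin).+\s")
greedy_result = greedy.sub("!@!", s)
print(greedy_result)   # only the word in!@!matched

# Lazy: stops at the FIRST whitespace that follows at least one character,
# so it takes the leading space, "CAPS", and the space after it.
lazy = re.compile(r"(?<=\bin).+?\s")
lazy_result = lazy.sub("!@!", s)
print(lazy_result)     # only the word in!@!should be matched
```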
> My question is simply this: after specifying a start point, how do I
> make a match stop after it has found one word, and one word only? As
> always, all help is appreciated.
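
One way to stop after exactly one word, offered as a sketch rather than the only answer: describe the word itself with "\w+" instead of ".+", and pull the space into the lookbehind so the match starts on the word. The lookbehind stays fixed-width, which the re module requires. Python 3 syntax; the names below are mine.

```python
import re

s = "only the word in CAPS should be matched"

# \w+ can only consume word characters, so it cannot run past "CAPS".
one_word = re.compile(r"(?<=\bin\s)\w+")
found = one_word.search(s).group()
replaced = one_word.sub("!@!", s)
print(found)      # CAPS
print(replaced)   # only the word in !@! should be matched

# Alternative without a lookbehind: match "in" plus whitespace, then
# capture the following word in a group.
grouped = re.search(r"\bin\s+(\w+)", s)
print(grouped.group(1))   # CAPS
```

The capture-group version also tolerates more than one space after "in", which a fixed-width lookbehind cannot do.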
'There is only one basic human right, and that is to do as you damn well please.
And with it comes the only basic human duty, to take the consequences.