[Tutor] stopping greedy matches

Mike Hall michael.hall at critterpixstudios.com
Thu Mar 17 01:21:27 CET 2005


Liam, re.compile("in (.*?)\b") does not find any match in the example
string I provided. I have had little luck with these non-greedy
matchers.
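
One possible explanation, offered as a sketch rather than a confirmed
diagnosis: the suggested pattern is not a raw string, so Python turns
"\b" into a backspace character before re ever sees it. The variable
names below are only illustrative.

```python
import re

s = "only the word in CAPS should be matched"

# Without the r prefix, "\b" is the backspace character (\x08), so the
# pattern looks for a literal backspace and finds nothing in s.
no_raw = re.compile("in (.*?)\b")
print(no_raw.search(s))

# With a raw string, \b reaches the regex engine as a word boundary.
# The lazy .*? is then satisfied by zero characters, because a word
# boundary already sits right before "CAPS", so the group is empty.
lazy_star = re.compile(r"in (.*?)\b")
print(repr(lazy_star.search(s).group(1)))

# Requiring at least one character (.+?) makes the lazy match grow
# until the next word boundary, which is the end of "CAPS".
lazy_plus = re.compile(r"in (.+?)\b")
print(lazy_plus.search(s).group(1))
```

So the non-greedy quantifier itself seems fine; the raw-string prefix
and the choice of .+? over .*? are what change the result here.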

I don't appear to have redemo.py on my system (OS X), as an import
returns an error. I will look into finding this script, thanks for
pointing me towards it :)


On Mar 16, 2005, at 2:36 PM, Liam Clarke wrote:

>> >>> x = re.compile(r"(?<=\bin).+\b")
>
> Try
>
> >>> x = re.compile("in (.*?)\b")
>
> .*? is a non-greedy matcher I believe.
>
> Are you using python24/tools/scripts/redemo.py? Use that to test 
> regexes.
>
> Regards,
>
> Liam Clarke
>
> On Wed, 16 Mar 2005 12:12:32 -0800, Mike Hall
> <michael.hall at critterpixstudios.com> wrote:
>> I'm having trouble getting re to stop matching after it has consumed
>> what I want it to. Using this string as an example, the goal is to
>> match "CAPS":
>>
>> >>> s = "only the word in CAPS should be matched"
>>
>> So let's say I want to specify when to begin my pattern by using a
>> lookbehind:
>>
>> >>> # this will simply match the empty spot right after "in"
>> >>> x = re.compile(r"(?<=\bin)")
>>
>> So that's straightforward, but let's say I don't want to use a
>> lookahead to specify the end of my pattern, I simply want it to stop
>> after it has combed over the word following "in". I would expect this
>> to work, but it doesn't:
>>
>> >>> # this will consume everything past "in", all the way to the end
>> >>> # of the string
>> >>> x = re.compile(r"(?<=\bin).+\b")
>>
>> In the above example I would think that the word boundary anchor "\b"
>> would indicate a stopping point. Is ".+\b" not saying, "keep matching
>> characters until a word boundary has been reached"?
>>
>> Even stranger are the results I get from:
>>
>> >>> # keep matching characters until a whitespace has been reached(?)
>> >>> x = re.compile(r"(?<=\bin).+\s")
>> >>> r = x.sub("!@!", s)
>> >>> print r
>> only the word in!@!matched
>>
>> For some reason it has decided to consume three words instead of
>> one.
>>
>> My question is simply this: after specifying a start point, how do I
>> make a match stop after it has found one word, and one word only? As
>> always, all help is appreciated.
>>
>
>
> -- 
> 'There is only one basic human right, and that is to do as you damn 
> well please.
> And with it comes the only basic human duty, to take the consequences.'
>
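
On the original question quoted above, one way to read the results is
that .+ is greedy: it runs to the end of the string and then gives
characters back only until \b or \s can match, which happens at the
last such position rather than the first. The sketch below reproduces
the three-word result and shows two ways to stop after one word without
a lookahead. The pattern names are hypothetical, and moving the space
into the lookbehind is an assumption about what counts as the start
point.

```python
import re

s = "only the word in CAPS should be matched"

# Greedy .+ runs to the end of the string, then backtracks only until
# \s can match, which is at the LAST space, so three words are consumed.
greedy = re.compile(r"(?<=\bin).+\s")
print(greedy.sub("!@!", s))

# \w+ cannot cross whitespace, so it stops after a single word.
# The space after "in" is moved into the (fixed-width) lookbehind.
one_word = re.compile(r"(?<=\bin )\w+")
print(one_word.search(s).group())
print(one_word.sub("!@!", s))

# A lazy quantifier also works, but the leading whitespace has to be
# skipped first; otherwise .+? would stop at the word boundary that
# sits right before "CAPS" and match only the space.
lazy = re.compile(r"(?<=\bin)\s+(.+?)\b")
print(lazy.search(s).group(1))
```

Of the two, the \w+ form is probably the easier one to reason about,
since the character class itself says where the match has to end.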


