python regex "negative lookahead assertions" problems

MRAB python at mrabarnett.plus.com
Sun Nov 22 11:32:49 EST 2009


Tim Chase wrote:
>> >>> import re
>> >>> line='2009-11-22 12:15:441  lmqkjsfmlqshvquhsudfhqf qlsfh qsduidfhqlsiufh qlsiuf qldsfhqlsifhqlius dfh warning qlsfj lqshf lqsuhf lqksjfhqisudfh qiusdfhq iusfh'
>> >>> re.match('.*(?!warning)',line)
>> <_sre.SRE_Match object at 0xb75b1598>
>>
>> I would expect that this would NOT match as it's a negative lookahead 
>> and warning is in the string.
> 
> This first finds everything (".*") and then asserts that "warning" 
> doesn't follow it, which is correct in your example. You may have to 
> assert that "warning" doesn't exist at every point along the way:
> 
>   re.match(r'(?:(?!warning).)*',line)
> 
> which will match up-to-but-not-including the "warning" text.  If you 
> don't want it at all, you'd have to also anchor the far end
> 
>   re.match(r'^(?:(?!warning).)*$',line)
> 
> but in the 2nd case I'd just as soon invert the test:
> 
>   if 'warning' not in line:
>     do_stuff()
> 
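For reference, the three patterns quoted above can be compared side by side. This is only a sketch: the sample line is a shortened stand-in for the one in the original post, and the variable names are made up for illustration.

```python
import re

# Shortened stand-in for the log line in the original post (an assumption).
line = '2009-11-22 12:15:44 qlsfh warning lqshf'

# '.*' runs to the end of the line, where nothing follows, so the
# negative lookahead succeeds and the whole line is matched.
greedy = re.match(r'.*(?!warning)', line)

# Checking the lookahead before every character stops the match
# just before 'warning'.
tempered = re.match(r'(?:(?!warning).)*', line)

# Anchoring both ends rejects any line that contains 'warning'.
anchored = re.match(r'^(?:(?!warning).)*$', line)

print(greedy.group(0) == line)   # True
print(repr(tempered.group(0)))   # '2009-11-22 12:15:44 qlsfh '
print(anchored)                  # None
```
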
The trick is to think what positive lookahead you'd need if you wanted
to check whether 'warning' is present:

     '(?=.*warning)'

and then negate it:

     '(?!.*warning)'

giving you:

     re.match(r'(?!.*warning)', line)
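
A minimal sketch of how that call behaves, wrapped in a helper whose name (lacks_warning) is hypothetical and not from the thread. The lookahead consumes nothing, so a successful match is an empty string; what matters is whether a match object comes back at all.

```python
import re

# Hypothetical helper name, used only for illustration.
def lacks_warning(line):
    # A match object is returned only when 'warning' appears nowhere
    # after the start of the line; otherwise re.match returns None.
    return re.match(r'(?!.*warning)', line) is not None

print(lacks_warning('disk usage warning on /dev/sda1'))  # False
print(lacks_warning('all systems nominal'))              # True
```

Note that '.' does not match a newline by default, so for a multi-line string this only checks the first line unless re.DOTALL is passed.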


