python regex "negative lookahead assertions" problems
MRAB
python at mrabarnett.plus.com
Sun Nov 22 11:32:49 EST 2009
Tim Chase wrote:
>> >>> import re
>> >>> line = ('2009-11-22 12:15:441 lmqkjsfmlqshvquhsudfhqf qlsfh '
>> ...         'qsduidfhqlsiufh qlsiuf qldsfhqlsifhqlius dfh warning qlsfj lqshf '
>> ...         'lqsuhf lqksjfhqisudfh qiusdfhq iusfh')
>> >>> re.match('.*(?!warning)',line)
>> <_sre.SRE_Match object at 0xb75b1598>
>>
>> I would expect that this would NOT match as it's a negative lookahead
>> and warning is in the string.
>
> This first finds everything (".*") and then asserts that "warning"
> doesn't follow it, which is correct in your example. You may have to
> assert that "warning" doesn't exist at every point along the way:
>
> re.match(r'(?:(?!warning).)*',line)
>
> which will match up-to-but-not-including the "warning" text. If you
> don't want it at all, you'd have to also anchor the far end
>
> re.match(r'^(?:(?!warning).)*$',line)
>
> but in the 2nd case I'd just as soon invert the test:
>
> if 'warning' not in line:
>     do_stuff()
>
The trick is to think what positive lookahead you'd need if you wanted
to check whether 'warning' is present:
'(?=.*warning)'
and then negate it:
'(?!.*warning)'
giving you:
re.match(r'(?!.*warning)', line)
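
As a quick sanity check, here is a minimal sketch comparing the three
patterns from this thread. The shortened sample strings and the names
`clean`, `no_warning` and `m` are made up for illustration; they are not
from the original posts.

```python
import re

# Shortened, hypothetical stand-ins for the log line in the thread.
line = ('2009-11-22 12:15:441 lmqkjsfmlqshvquhsudfhqf qlsfh '
        'qsduidfhqlsiufh warning qlsfj lqshf')
clean = '2009-11-22 12:15:441 nothing to report'

# Original attempt: '.*' consumes the whole line first, then the lookahead
# is tested at the very end, where nothing (so no 'warning') follows.
assert re.match(r'.*(?!warning)', line).group(0) == line

# Lookahead tested at the start: it fails if 'warning' occurs anywhere later.
no_warning = re.compile(r'(?!.*warning)')
assert no_warning.match(line) is None
assert no_warning.match(clean) is not None

# Tim's per-character form: matches up to, but not including, 'warning'.
m = re.match(r'(?:(?!warning).)*', line)
assert m.group(0) == line[:line.index('warning')]
```

Note that '.' does not match a newline by default, so if the string can
span several lines you would probably want to pass re.DOTALL as well.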