[Tutor] regex: not start with FOO
Kent Johnson
kent37 at tds.net
Tue Feb 3 18:00:50 CET 2009
On Tue, Feb 3, 2009 at 11:12 AM, Bernard Rankin <berankin99 at yahoo.com> wrote:
> In [3]: re.findall('^(?!FOO)in', 'in in in')
> Out[3]: ['in']
>
> In [4]: re.findall('(?!^FOO)in', 'in in in')
> Out[4]: ['in', 'in', 'in']
>
> In [5]: re.findall('(?!FOO)in', 'in in in')
> Out[5]: ['in', 'in', 'in']
>
> In [6]: re.findall('(?!FOO$)in', 'in in in')
> Out[6]: ['in', 'in', 'in']
>
> In [7]: re.findall('(?!^FOO$)in', 'in in in')
> Out[7]: ['in', 'in', 'in']
>
>
> What is the effective difference between numbers 4 thru 7?
>
> That is, what effect does a string position anchor have within the sub expression?
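6 & 7 are meaningless; the assertion is tested at the same position
where 'in' has to match, and text that begins with 'in' can never
begin with FOO, with or without an end-of-line ($) after it.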
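A quick check of this, as a sketch only: it runs the four patterns from the question against the same test string and compares each result with a plain 'in' search. The names text, patterns, baseline and results are just illustration.

```python
import re

text = 'in in in'

# Patterns 4 through 7 from the question.
patterns = [r'(?!^FOO)in', r'(?!FOO)in', r'(?!FOO$)in', r'(?!^FOO$)in']

# What a search for 'in' finds with no assertion at all.
baseline = re.findall(r'in', text)

# Each look-ahead variant finds exactly the same matches.
results = {p: re.findall(p, text) for p in patterns}
for p, found in results.items():
    print(p, found == baseline)
```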
> Hmm...
>
> In [30]: re.findall('(?!FOO)in', 'in FOOin in')
> Out[30]: ['in', 'in', 'in']
OK. (?!...) is a look *ahead* assertion - it requires that the text
starting at the current position not match the given expression. It
seems that this is meaningless at the start of this regex, since the
text there also has to match 'in' and so can never be FOO.
In other words, '(?!FOO)in' matches 'in FOOin in' at position 6
because starting at position 6 there is no FOO.
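To see the positions involved, here is a small sketch using re.finditer, which reports where each match starts. The variable names are only for illustration.

```python
import re

text = 'in FOOin in'

# Start offset of every match. The 'in' at offset 6 comes right
# after FOO, but the look-ahead only inspects text from offset 6 on.
starts = [m.start() for m in re.finditer(r'(?!FOO)in', text)]
print(starts)
```

The FOO occupies offsets 3 to 5, so at offset 6 the assertion sees 'in in' and passes.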
You should use a look-behind assertion, or just put the ^ outside the assertion.
In [2]: re.findall('(?!FOO)in', 'in FOOin in')
Out[2]: ['in', 'in', 'in']
In [3]: re.findall('(?<!FOO)in', 'in FOOin in')
Out[3]: ['in', 'in']
In [4]: re.findall('(?!^FOO)in', 'FOOin FOOin in')
Out[4]: ['in', 'in', 'in']
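For the subject-line case (strings that do not start with FOO), a minimal sketch of both suggestions follows. The sample strings and the names not_foo, samples, kept and behind are made up for illustration and are not from the thread.

```python
import re

# ^ outside the assertion: the look-ahead is tested once, at the
# start of the string, so it rejects strings that begin with FOO.
not_foo = re.compile(r'^(?!FOO)')

samples = ['FOOin', 'in', 'barFOO', 'FOO']
kept = [s for s in samples if not_foo.search(s)]
print(kept)

# Look-behind: skip any 'in' that comes directly after FOO.
behind = re.findall(r'(?<!FOO)in', 'FOOin FOOin in')
print(behind)
```

The first form filters whole strings; the second filters individual matches by what precedes them.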
Kent