How make regex that means "contains regex#1 but NOT regex#2" ??

Paul McGuire ptmcg at austin.rr.com
Tue Jul 1 04:04:50 EDT 2008


On Jul 1, 2:34 am, "A.T.Hofkamp" <h... at se-162.se.wtb.tue.nl> wrote:
> On 2008-07-01, seber... at spawar.navy.mil <seber... at spawar.navy.mil> wrote:
>
> > I'm looking over the docs for the re module and can't find how to
> > "NOT" an entire regex.
>
> (?! R)
>
> > How make regex that means "contains regex#1 but NOT regex#2" ?
>
> (\1|(?!\2))
>
> should do what you want.
>
> Albert

I think the OP wants both A AND not B, not A OR not B.  If the OP wants
to do re.match(A and not B), then I think this can be done as
((?!\2)\1), but if he really wants CONTAINS A and not B, then I think
this requires 2 calls to re.search.  See test code below:

import re

def test(restr,instr):
    print "%s match %s? %s" % \
        (restr,instr,bool(re.match(restr,instr)))

a = "AAA"
b = "BBB"

aAndNotB = "(%s|(?!%s))" % (a,b)

test(aAndNotB,"AAA")
test(aAndNotB,"BBB")
test(aAndNotB,"AAABBB")
test(aAndNotB,"zAAA")
test(aAndNotB,"CCC")

aAndNotB = "((?!%s)%s)" % (b,a)

test(aAndNotB,"AAA")
test(aAndNotB,"BBB")
test(aAndNotB,"AAABBB")
test(aAndNotB,"zAAA")
test(aAndNotB,"CCC")

def test2(arestr,brestr,instr):
    print "%s contains %s but NOT %s? %s" % \
        (instr,arestr,brestr,
         bool(re.search(arestr,instr) and
              not re.search(brestr,instr)))

test2(a,b,"AAA")
test2(a,b,"BBB")
test2(a,b,"AAABBB")
test2(a,b,"zAAA")
test2(a,b,"CCC")

Prints:

(AAA|(?!BBB)) match AAA? True
(AAA|(?!BBB)) match BBB? False
(AAA|(?!BBB)) match AAABBB? True
(AAA|(?!BBB)) match zAAA? True
(AAA|(?!BBB)) match CCC? True
((?!BBB)AAA) match AAA? True
((?!BBB)AAA) match BBB? False
((?!BBB)AAA) match AAABBB? True
((?!BBB)AAA) match zAAA? False
((?!BBB)AAA) match CCC? False
AAA contains AAA but NOT BBB? True
BBB contains AAA but NOT BBB? False
AAABBB contains AAA but NOT BBB? False
zAAA contains AAA but NOT BBB? True
CCC contains AAA but NOT BBB? False


As we've all seen before, posters are not always the most precise when
describing whether they want match vs. search.  Given that the OP used
the word "contains", I read that to mean "search".  I'm not an RE pro
by any means, but I think the behavior that the OP wants is given in
the last 5 tests (the test2 calls), and I don't know how to do that in
a single RE.
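
That said, one candidate for a single RE is to anchor at the start of
the string, put a negative lookahead for ".*B" there, and then scan
for A.  This is only a sketch: contains_a_not_b is a made-up helper
name, and it assumes A and B are self-contained sub-patterns (no
numbered backreferences that would be thrown off by being embedded in
a larger pattern).  It is written to run under both Python 2 and 3.

```python
import re

def contains_a_not_b(a, b):
    """Build one pattern meaning 'contains a, and b occurs nowhere'.

    Hypothetical helper, not part of the re module.
    """
    # \A anchors at the very start of the string.
    # (?!.*(?:b)) fails the whole match if b occurs anywhere after that.
    # .*(?:a) then requires a to occur somewhere.
    # DOTALL lets '.' cross newlines so multi-line input is covered.
    return re.compile(r"\A(?!.*(?:%s)).*(?:%s)" % (b, a), re.DOTALL)

pattern = contains_a_not_b("AAA", "BBB")
inputs = ["AAA", "BBB", "AAABBB", "zAAA", "CCC"]
results = dict((s, bool(pattern.search(s))) for s in inputs)

for s in inputs:
    print("%s contains AAA but NOT BBB? %s" % (s, results[s]))
```

On the five test strings this gives the same True/False answers as the
two-call re.search version.  The two-call version is still easier to
read, and it does not depend on how A and B behave when embedded in a
larger pattern.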

-- Paul


