[Tutor] Why doesn't this regex match???

Tim Peters tim.one@comcast.net
Sat, 09 Feb 2002 16:07:58 -0500


[Sheila King]
> Actually, what we'd been doing was more like:
>
>         s = subject.lower()
>         s = ' ' + s + ' '
>         isjunk = 0
>         for phrase in spamphrases:
>             if s.find(' ' + phrase + ' ') >= 0:
>                 isjunk = 1
>                 break
>
> We wanted to match, say, "sex" but not "sextant". (As the joke is going
> in our discussion group...SO MANY spams that we receive have the word
> "sextant" in the subject line!!! )

If you're going to run this code a lot, you want to do as little work as
possible in the inner loop.  So you would want to stick a blank on each side
of the phrases once and for all.  Regexps are certainly more flexible here.
The question is whether it's less work in the end to craft a regexp that
works than to pick up a spam filter from someone else <wink>.
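
Here's a rough sketch of both ideas, not a finished filter.  The contents of
spamphrases and the names padded_phrases, spam_re, isjunk_find and isjunk_re
are made up for illustration; only the blank-padding trick comes from the
code quoted above.  The regexp version assumes that \b word boundaries are
what you mean by "a whole word".

```python
import re

# Hypothetical phrase list, for illustration only.
spamphrases = ["sex", "free money"]

# Stick a blank on each side of every phrase once, outside the loop
# that runs for each message.
padded_phrases = [" " + phrase + " " for phrase in spamphrases]

def isjunk_find(subject):
    """Plain substring test against the pre-padded phrases."""
    s = " " + subject.lower() + " "
    for padded in padded_phrases:
        if padded in s:
            return True
    return False

# Regexp alternative: one pattern, compiled once, with \b on each side
# so that "sex" matches but "sextant" does not.
spam_re = re.compile(
    r"\b(?:" + "|".join(re.escape(p) for p in spamphrases) + r")\b"
)

def isjunk_re(subject):
    """Whole-word test using the precompiled regexp."""
    return spam_re.search(subject.lower()) is not None
```

The two don't agree everywhere:  a subject like "sex!" gets past the
blank-padded test, because the phrase is followed by "!" rather than a blank,
while the \b version catches it.  That's the sort of flexibility regexps buy.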