Spam filter
Mikkel Rasmussen
footech at get2net.dk
Tue Apr 17 20:21:12 EDT 2001
Laura Creighton <lac at cd.chalmers.se> wrote in message
news:mailman.987524243.29270.python-list at python.org...
> You realise, Brian, that if you ever get your program working, your
> original message would be filtered out. After all your message contains:
>
> >keywords in body, like sex, viagra, penis, money, income, earn, free, free, free
>
> If you make your program freely available, I see severe problems in
> mailing you bug reports, as well.
>
> Laura Creighton
>
:-)
Based on statistical and linguistic evidence (generated by Python code), the
following keywords work fine for filtering the spam I receive:
"sexual", "gambling" and "$".
That takes care of almost all spam without touching useful messages. It is
still being developed though and gets better all the time.
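
A minimal sketch of what such a keyword check could look like in Python. This
is not the actual filter; the names KEYWORDS and looks_like_spam are made up
for illustration, and plain case-insensitive substring matching is an
assumption:

```python
# Hypothetical sketch: names and matching strategy are illustrative only.
KEYWORDS = ("sexual", "gambling", "$")


def looks_like_spam(body, keywords=KEYWORDS):
    """Return True if any keyword occurs anywhere in the message body."""
    text = body.lower()  # compare case-insensitively
    return any(keyword in text for keyword in keywords)


if __name__ == "__main__":
    for message in ("Online GAMBLING bonus, win $1000 today",
                    "Meeting moved to Thursday at noon"):
        print(looks_like_spam(message), "-", message)
```

Substring matching is crude: a legitimate message that quotes a price in
dollars would be flagged too, which is exactly the "filter *only* spam"
problem.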
The big problem is how to filter *only* spam. The easy way out would be to
remove all non-Danish messages, since I have never received spam in Danish,
but that has too many unwanted side-effects :-)
Mikkel Rasmussen