[Image-SIG] SPAM/Filters not useless
Fri, 24 May 2002 21:26:39 +0200
> I read / post to about half a dozen mailing lists. All of them have spam
> that leaks through. What leaks through, though, is usually a small fraction
> of what they get bombed with. The filtering used is not perfect, but far
> from useless.
The python.org mailing lists all use the SpamAssassin filter to
kill things that are obviously spam. We use the same filter at
pythonware.com, and judging from our logs, the filter correctly
identifies about 98-99% of all incoming spam. YMMV.
Quoting Greg Ward:
Here's how it works: all mail coming into mail.python.org is run through
SpamAssassin, which performs several hundred tests on each message.
(Many are regex tests, others look at the MIME structure, others check
various DNS blacklists, etc.) Each test has a score associated with it,
and each message is scored according to the sum of all tests that it
matches. E.g. if the message starts "Dear Friend" (or similar), it scores
2 points; any mention of copying DVDs is worth 2.7 points, etc. If a
message scores 5 or over, it's considered spam and shunted off to a
special folder for review by one of the postmasters.
The tricky area is messages that don't quite score 5. Since
SpamAssassin doesn't consider them spam, they are sent on to their
(see spamassassin.org for more info on this filter)
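To make the scoring idea concrete, here is a minimal Python sketch of "sum the scores of all matching tests, compare against 5". It is not SpamAssassin's actual code or rule set: the rule names and regex patterns are invented for illustration, and only the two example scores (2 and 2.7) and the threshold of 5 come from the description above.

```python
import re

# Hypothetical rules as (name, pattern, score). Only the scores 2 and 2.7
# come from the description above; names and patterns are made up here.
RULES = [
    ("DEAR_FRIEND", re.compile(r"\Adear friend", re.IGNORECASE), 2.0),
    ("COPY_DVD", re.compile(r"\bcopy(?:ing)?\s+dvds?\b", re.IGNORECASE), 2.7),
]

SPAM_THRESHOLD = 5.0  # "scores 5 or over" is treated as spam


def score_message(body):
    """Return the sum of the scores of every rule the message body matches."""
    text = body.lstrip()
    return sum(score for _name, pattern, score in RULES if pattern.search(text))


def is_spam(body):
    """True if the total score reaches the spam threshold."""
    return score_message(body) >= SPAM_THRESHOLD
```

A message that opens with "Dear Friend" and mentions copying DVDs matches both toy rules and lands just under the threshold, which is exactly the "doesn't quite score 5" case described above.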
I've modified the image-sig filter to move mails that score between
3 and 5 to a moderation queue. I cannot guarantee that it will catch
every future spam, but it might make it a bit better.
(one per month instead of one per week, perhaps?)
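For the curious, the routing rule amounts to something like the Python sketch below. This is only an illustration, not the actual filter configuration: the function name and the destination labels are hypothetical, and treating a score of exactly 3 as "moderate" is my assumption about where the lower boundary falls.

```python
SPAM_THRESHOLD = 5.0        # 5 or over: considered spam
MODERATION_THRESHOLD = 3.0  # assumed inclusive lower bound of the "3 to 5" band


def route(score):
    """Pick a destination for a message given its total filter score.

    The labels returned here are hypothetical names for the three outcomes.
    """
    if score >= SPAM_THRESHOLD:
        return "spam-review"   # held for review by a postmaster
    if score >= MODERATION_THRESHOLD:
        return "moderation"    # held in the list's moderation queue
    return "deliver"           # passed on to the list as usual
```

So a message scoring 4.7 would be held for moderation instead of going straight to the list, while anything under 3 is delivered as before.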