[spambayes-dev] Whitelists (was: A spectacular false positive)

Skip Montanaro skip at pobox.com
Sun Nov 16 22:35:16 EST 2003


    Richie>  o Whenever a message is trained as spam, remove the From
    Richie>    address from the whitelist.

So when a spammer forges Barry Warsaw's address (as I've seen before), Barry
disappears from my whitelist?
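
To make the concern concrete, here is a minimal sketch of that rule.  The
Whitelist class and its method name are hypothetical (nothing like this
exists in SpamBayes as far as this thread shows), and the address is a
placeholder:

```python
from email.utils import parseaddr


class Whitelist:
    """Hypothetical whitelist keyed on the bare From address."""

    def __init__(self, addresses=()):
        self.addresses = {a.lower() for a in addresses}

    def on_train_spam(self, from_header):
        # The proposed rule: training a message as spam drops its
        # From address from the whitelist.
        _, addr = parseaddr(from_header)
        self.addresses.discard(addr.lower())

    def __contains__(self, address):
        return address.lower() in self.addresses


wl = Whitelist(["barry@example.org"])
# A spammer forges Barry's address and the message gets trained as spam.
wl.on_train_spam('"Barry Warsaw" <barry@example.org>')
print("barry@example.org" in wl)  # False: the real Barry is gone too
```

The rule cannot tell a forged From header from a genuine one, so a single
forged spam is enough to evict a legitimate correspondent.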

Most of my email that comes in as ham is never a candidate for training.
Even if I fed my current ham training database to a whitelist generator it
wouldn't whitelist a single '@python.org' address.  It would get a number of
Python-related email addresses though:

    gerrit at nl.linux.org
    tim_one at users.sourceforge.net
    amk at amk.ca
    anthony at interlink.com.au
    ...

While all these addresses are certainly valuable contacts in the Python
world, they are hardly representative of the email addresses which would
float to the front of my cortex if I decided to build a whitelist manually.
They just happen to be authors of Python-related messages on which I've
trained.  My current set of ham includes 11 python-list messages, two
python-checkins messages, and one each from the spambayes, mailman and
python-dev lists.  My Python mailbox obviously contains a lot more mail, but
it includes messages from random people asking Python questions which I
simply forgot to delete as well as messages I've saved for their content,
not necessarily who they are from.
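
A whitelist generator of the kind imagined above might look roughly like
this.  It is only a sketch: generate_whitelist is a made-up helper, the
sample messages are stand-ins for a real ham training set, and only the
standard library email package is used:

```python
from collections import Counter
from email import message_from_string
from email.utils import parseaddr


def generate_whitelist(raw_messages):
    """Count the From address of every trained-ham message."""
    counts = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        _, addr = parseaddr(msg.get("From", ""))
        if addr:
            counts[addr.lower()] += 1
    return counts


# Stand-in for a ham training set; real input would be the trained messages.
ham = [
    "From: Some Poster <poster@example.com>\nSubject: question\n\nbody\n",
    "From: poster@example.com\nSubject: follow-up\n\nbody\n",
    "From: Other Person <other@example.net>\nSubject: hi\n\nbody\n",
]
whitelist = generate_whitelist(ham)
print(sorted(whitelist))
```

Whatever comes out reflects who happened to write the trained messages, not
who matters to the recipient, which is the mismatch described above.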

    Richie>  o Whenever a message is received from a whitelisted address,
    Richie>    and scores as solid (for some value of 'solid') ham,
    Richie>    auto-train the message as ham.  You'd use this for personal
    Richie>    acquaintances only, and not for mailing lists or
    Richie>    organisations (amazon.com, ebay.com, etc.)

Now we're back to growing large databases.  I think over time you might wind
up with a highly unbalanced set of ham and spam.
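
For illustration, the auto-train rule could be sketched as below.  The
cutoff of 0.05 is an assumed stand-in for "solid" ham (the proposal leaves
the value open), and should_auto_train is a hypothetical name:

```python
HAM_CUTOFF = 0.05  # assumed value for 'solid' ham; the proposal leaves it open


def should_auto_train(sender, score, whitelist, cutoff=HAM_CUTOFF):
    """Auto-train only whitelisted senders whose message scores as solid ham."""
    return sender.lower() in whitelist and score <= cutoff


whitelist = {"friend@example.org"}
incoming = [
    ("friend@example.org", 0.00),
    ("friend@example.org", 0.01),
    ("friend@example.org", 0.40),    # unsure, so not auto-trained
    ("stranger@example.com", 0.00),  # not whitelisted
]
trained_ham = sum(should_auto_train(s, p, whitelist) for s, p in incoming)
print(trained_ham)  # 2
```

Every qualifying message adds to the ham count with no matching spam being
trained, which is where the imbalance would come from.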

Of course, as Tim pointed out, we all seem to be flying more-or-less by the
seat of our pants vis a vis training, so one feature is as good as another.
Still, I get so few false positives that I find it hard to believe a
whitelist - even if it included my wife and my boss - would be helpful.

    Richie> The upshot: I still don't trust SpamBayes to delete my Spam
    Richie> without looking at it.

I have auto-deleted spam with a classification of "spam; 1.00" for a couple
of months.  My boss hasn't fired me yet for not responding to an email.
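
A filter along those lines could be sketched as follows.  The header name
X-Spambayes-Classification is an assumption about how the classification
is attached to the message; the "spam; 1.00" value is the one quoted above,
and should_auto_delete is a hypothetical helper:

```python
from email import message_from_string

# Assumed header name; the value format "spam; 1.00" is the one quoted above.
HEADER = "X-Spambayes-Classification"


def should_auto_delete(raw_message):
    """True only for messages labelled spam with a displayed score of 1.00."""
    msg = message_from_string(raw_message)
    value = msg.get(HEADER, "")
    label, _, score = value.partition(";")
    return label.strip() == "spam" and score.strip() == "1.00"


print(should_auto_delete("X-Spambayes-Classification: spam; 1.00\n\nbuy now\n"))  # True
print(should_auto_delete("X-Spambayes-Classification: spam; 0.97\n\nbuy now\n"))  # False
print(should_auto_delete("X-Spambayes-Classification: unsure; 0.50\n\nhello\n"))  # False
```

Anything short of a displayed 1.00 is left alone for manual review.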

Skip


