[Spambayes] Low-priority feature request

John Byrd jbyrd at well.com
Sun May 30 13:25:27 EDT 2004


> I believe the database would have to save additional info to have any hope
> of giving a meaningful report about "most indicative of spam".  These aren't
> necessarily the tokens with the highest spamprobs!  For most people, who do
> some form of mistake-based training, the tokens with the highest spamprobs
> are merely those that got *trained* on most often.

> What this shows is what I already knew <wink>:  most of the Unsures I train
> on as spam are autoreply or bounce kinds of messages, due to virus and spam
> email forged to appear as if it came from one of the public admin and help
> addresses I volunteer for, or from one of my personal addresses.  I get a
> ton of these, and they're spam to me.

A fascinating analysis, Tim... and it does suggest to me that there is useful
information in the spam database... but a human will always have to make
sense of the top and bottom 10% spamprobs.  In my case, I'm receiving a huge
quantity of spam on "mail.well.com".  I'm guessing that spammers, at one
time or another, have tried every legal e-mail address on this server in
order to find mine.  Same thing on an old college e-mail forwarding account,
which I never use.

So SpamBayes helped me find this source of spam, which is highly specific to
my personal e-mail situation.
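
For anyone who wants to eyeball the extremes the same way, here is a minimal
sketch of pulling the top and bottom 10% of spamprobs out of a token table.
It assumes you have already dumped the database into a plain dict mapping
token to spamprob; it does not use the SpamBayes API. The function name
extreme_tokens and every token in the sample except "mail.well.com" are
made up for illustration.

```python
def extreme_tokens(spamprobs, fraction=0.10):
    """Return (hammiest, spammiest) lists of (token, spamprob) pairs.

    spamprobs is assumed to be a plain dict of token -> spamprob,
    exported from the database by some other means.
    """
    # Sort ascending by spamprob: hammiest tokens first, spammiest last.
    ranked = sorted(spamprobs.items(), key=lambda item: item[1])
    # Always show at least one token from each end.
    k = max(1, int(len(ranked) * fraction))
    hammiest = ranked[:k]
    spammiest = ranked[-k:][::-1]  # highest spamprob first
    return hammiest, spammiest


# Hypothetical sample data, not real database contents.
sample = {
    "mail.well.com": 0.99,
    "viagra": 0.98,
    "unsubscribe": 0.91,
    "free": 0.85,
    "click": 0.80,
    "hello": 0.50,
    "thanks": 0.30,
    "patch": 0.10,
    "meeting": 0.08,
    "python": 0.02,
}

hammiest, spammiest = extreme_tokens(sample)
for token, prob in spammiest:
    print("spammy: %-15s %.2f" % (token, prob))
for token, prob in hammiest:
    print("hammy:  %-15s %.2f" % (token, prob))
```

As Tim points out above, a high spamprob by itself may only mean the token
was trained on a lot, so a list like this is a starting point for a human to
read through, not a report of what is "most indicative of spam".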

jwb




