[Spambayes] Low-priority feature request
John Byrd
jbyrd at well.com
Sun May 30 13:25:27 EDT 2004
> I believe the database would have to save additional info to have any hope
> of giving a meaningful report about "most indicative of spam". These aren't
> necessarily the tokens with the highest spamprobs! For most people, who do
> some form of mistake-based training, the tokens with the highest spamprobs
> are merely those that got *trained* on most often.
> What this shows is what I already knew <wink>: most of the Unsures I train
> on as spam are autoreply or bounce kinds of messages, due to virus and spam
> email forged to appear as if it came from one of the public admin and help
> addresses I volunteer for, or from one of my personal addresses. I get a
> ton of these, and they're spam to me.
A fascinating analysis, Tim... and it does suggest to me that there is useful
information in the spam database... but a human will always have to make
sense of the top and bottom 10% spamprobs. In my case, I'm receiving a huge
quantity of spam on "mail.well.com". I'm guessing that spammers, at one
time or another, have tried every legal e-mail address on this server in
order to find mine. Same thing on an old college e-mail forwarding account,
which I never use.
So SpamBayes helped me find this source of spam, which is highly specific to
my personal e-mail situation.
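
For anyone who wants to eyeball the same extremes, here is a minimal sketch.
It assumes the token probabilities have already been exported to a plain
Python dict mapping token to spamprob. The function name, the dict, and the
sample tokens are all made up for illustration, and nothing here uses the real
SpamBayes API or database format.

```python
def extreme_tokens(spamprobs, fraction=0.10):
    """Return (lowest, highest) slices of tokens ranked by spamprob.

    spamprobs is a hypothetical {token: probability} dict, not a real
    SpamBayes database object. Each slice holds about `fraction` of the
    tokens, with at least one token per slice.
    """
    ranked = sorted(spamprobs.items(), key=lambda item: item[1])
    k = max(1, int(len(ranked) * fraction))
    return ranked[:k], ranked[-k:]


# Made-up sample data, purely illustrative.
sample = {
    "mail.well.com": 0.99,
    "undeliverable": 0.97,
    "autoreply": 0.95,
    "mortgage": 0.90,
    "offer": 0.70,
    "meeting": 0.30,
    "patch": 0.10,
    "python": 0.05,
    "spambayes": 0.02,
    "wink": 0.01,
}

bottom, top = extreme_tokens(sample)
print("most ham-like: ", bottom)
print("most spam-like:", top)
```

As Tim points out, a high spamprob may only mean the token got trained on
often, so a list like this is a starting point for a human to interpret, not a
verdict.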
jwb