[Spambayes] Spam Clues

skip at pobox.com skip at pobox.com
Wed Apr 16 22:00:08 CEST 2008


David,

Your company's virus scanner seems to be contributing a lot of weight to the
miscalculation.  I'm going to guess this text gets appended to every
virus-free message which arrives at your server:

    No virus found in this incoming message.
    Checked by AVG.
    Version: 7.5.524 / Virus Database: 269.23.0/1379 - Release Date: 15/04/2008 18:10

Since most of the mail you receive probably doesn't contain viruses, you see
those tokens frequently, in ham and spam alike:

    token                      spamprob         #ham  #spam
    'message.'                 0.310872           15     13
    'date:'                    0.325631           14     13
    'checked'                  0.341867           13     13
    'database:'                0.341867           13     13
    'incoming'                 0.341867           13     13
    'version:'                 0.341867           13     13
    'virus'                    0.35698            14     15
    'release'                  0.358294           13     14
    'avg.'                     0.359817           12     13
    'found'                    0.385564           14     17

Those are not terribly strong ham signals, but there are a lot of them, so
together they skew the final score toward ham.  The only thing I can suggest
is to keep marking spam as such.  Eventually, the spam probabilities for those
tokens will drift into the no-man's land of 0.4 to 0.6 and they will be
ignored.
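
To see why ten weak clues matter, here is a rough sketch of the arithmetic.
It assumes chi-squared combining of the token probabilities and a minimum
clue strength of 0.1 (which is where the 0.4 to 0.6 dead zone comes from).
The names chi2q and combine are made up for this illustration, and the three
spam clues are invented values, not taken from your message:

```python
import math

def chi2q(x2, v):
    """P(chi-squared with v degrees of freedom >= x2), for even v."""
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, v // 2):
        term *= m / i
        total += term
    return min(total, 1.0)

def combine(probs, min_strength=0.1):
    """Combine per-token spam probabilities into one score in [0, 1]."""
    # Tokens too close to 0.5 carry no evidence and are dropped.
    used = [p for p in probs if abs(p - 0.5) >= min_strength]
    if not used:
        return 0.5
    n = len(used)
    # s is near 1 when the clues look spammy, h is near 1 when they look hammy.
    s = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in used), 2 * n)
    h = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in used), 2 * n)
    return (s - h + 1.0) / 2.0

# The ten AVG footer tokens from the table above.
avg_clues = [0.310872, 0.325631, 0.341867, 0.341867, 0.341867,
             0.341867, 0.35698, 0.358294, 0.359817, 0.385564]

# Hypothetical strong spam clues, for illustration only.
spam_clues = [0.99, 0.97, 0.95]

# The same footer after further training, assuming its tokens drift to 0.45.
neutral_clues = [0.45] * len(avg_clues)

print("spam clues alone:       %.3f" % combine(spam_clues))
print("with AVG footer now:    %.3f" % combine(spam_clues + avg_clues))
print("with footer in 0.4-0.6: %.3f" % combine(spam_clues + neutral_clues))
```

The second figure comes out lower than the first because the footer tokens
pull the score toward ham.  The third matches the first because tokens inside
the dead zone are dropped before combining.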

-- 
Skip Montanaro - skip at pobox.com - http://www.webfast.com/~skip/

