[Spambayes] suspect spam that isn't spam
Tim Devick
TDevick at advsol.com
Tue Jan 13 12:50:55 EST 2004
I regularly receive email from a particular source that is not spam, but it always ends up classified as "suspect". Every time I get one of these messages, I click the "Recover From Spam" button. The problem is that the next email from the same people just ends up in my "suspect" folder again. SpamBayes does not seem to be learning that these particular emails are always "good". It filters the real junk fine: all the Paris Hilton videos, the Viagra offers, etc. end up in my junk email folder as expected.
Here is one example of a "junk suspect" email that I received today. I have used "Recover From Spam" on emails from this sender numerous times since I installed SpamBayes, but his emails keep ending up in my "suspect" folder. The spam clues for this email show a spam score of 2%. In the SpamBayes manager, on the Filtering tab, "Certain Spam" is set to 90.0 and "Possible spam" is set to 15.0. Can anyone explain why SpamBayes has classified this email as possible spam when its spam score is only 2%?
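To make the expectation concrete, here is a minimal sketch of how I would expect the two thresholds to be applied to a score expressed as a percentage. The function name `classify`, the folder labels, and the use of `>=` at the boundaries are assumptions for illustration, not SpamBayes' actual code:

```python
def classify(score_percent, possible_spam=15.0, certain_spam=90.0):
    """Map a spam score (0-100) to a folder using two cutoffs.

    Hypothetical helper: the cutoff names mirror the Filtering tab
    settings, but the comparison logic is an assumption.
    """
    if score_percent >= certain_spam:
        return "spam"
    if score_percent >= possible_spam:
        return "suspect"
    return "good"


# The message in question scored about 1.5%, well under the 15.0 cutoff.
print(classify(1.54048))
```

Under these assumptions, a 2% score should be treated as "good", which is why the "suspect" classification is surprising.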
Spam Score: 2% (0.0154048)
word spamprob #ham #spam
'*H*' 0.993503 - -
'*S*' 0.024313 - -
'drop' 0.00602871 85 0
'client' 0.0110273 46 0
'to:name:tim devick' 0.0136364 37 0
'from:addr:aptissoftware.com' 0.0172683 29 0
'from:addr:jesse.barrera' 0.0172683 29 0
'from:name:jesse barrera' 0.0172683 29 0
'wanted' 0.0207168 24 0
'that.' 0.023537 21 0
'message-id:@txexchange.aptis.net' 0.0304445 16 0
'all,' 0.0342957 79 1
'bring' 0.0516763 9 0
'spent' 0.0516763 9 0
'wonder' 0.0516763 9 0
'checking' 0.0573943 8 0
'account' 0.0645352 7 0
'to:addr:tdevick' 0.084505 105 4
'employee' 0.0859137 5 0
"i'll" 0.0901009 28 1
'(after' 0.102969 4 0
'bet' 0.102969 4 0
'discussions' 0.102969 4 0
'night,' 0.102969 4 0
'touch' 0.102969 4 0
'walked' 0.102969 4 0
'win' 0.102969 4 0
'asked' 0.107203 23 1
'"what' 0.128473 3 0
'looked' 0.138822 17 1
'probably' 0.169148 36 3
'"i' 0.17077 2 0
'arrived' 0.17077 2 0
'closer' 0.17077 2 0
'confirmed' 0.17077 2 0
'involved,' 0.17077 2 0
'laugh' 0.17077 2 0
'url:sweepstakes' 0.17077 2 0
'tomorrow' 0.196895 11 1
'did' 0.21306 61 7
'subject:: ' 0.233902 273 36
'had' 0.241606 59 8
'until' 0.244312 51 7
'way' 0.244724 58 8
'noticed' 0.248971 8 1
'were' 0.250828 63 9
'enter' 0.252343 42 6
'"signing' 0.254588 1 0
'$25,000' 0.254588 1 0
'"no' 0.254588 1 0
'"what' 0.254588 1 0
'asked,' 0.254588 1 0
'bonus"' 0.254588 1 0
'hotjobs:' 0.254588 1 0
'impossible' 0.254588 1 0
'luck.' 0.254588 1 0
'president.' 0.254588 1 0
'square.' 0.254588 1 0
'surprised' 0.254588 1 0
'url:hotjobs' 0.254588 1 0
'url:signingbonus' 0.254588 1 0
'yahoo!' 0.254588 1 0
'yahoo!?' 0.254588 1 0
'reply-to:none' 0.257807 512 77
'skip:& 20' 0.265966 20 3
'message' 0.275112 177 29
'x-mailer:none' 0.276623 459 76
'head' 0.302269 6 1
'mind' 0.302269 6 1
'should' 0.307601 120 23
'amount' 0.308104 11 2
'then' 0.31211 87 17
'there' 0.317959 144 29
'them' 0.321871 54 11
'next' 0.323684 68 14
'open' 0.329875 38 8
'morning' 0.338656 14 3
'skip:& 10' 0.354548 34 8
'told' 0.358875 21 5
'talking' 0.372601 12 3
'doing' 0.372891 43 11
'was' 0.373746 151 39
'before.' 0.375829 8 2
'said,' 0.375829 8 2
'into' 0.376698 69 18
'consider' 0.3846 4 1
'lucky' 0.3846 4 1
'after' 0.384674 63 17
'that' 0.388769 312 86
'before' 0.391648 54 15
'said' 0.39891 35 10
'took' 0.602327 12 8
'happy' 0.615686 7 5
'your' 0.616077 188 131
'lot' 0.618837 14 10
'little' 0.638124 22 17
'started' 0.648662 11 9
'"no' 0.650288 1 1
'became' 0.650288 1 1
'curious' 0.650288 1 1
'turning' 0.650288 1 1
'over' 0.664362 44 38
'joke' 0.670673 2 2
'much' 0.671925 29 26
'front' 0.678755 3 3
'save' 0.684215 21 20
'many' 0.684828 22 21
'president' 0.685794 5 5
'long' 0.694073 18 18
'her' 0.694679 22 22
'skip:p 10' 0.702423 29 30
'better' 0.706218 16 17
'given' 0.734611 3 4
'full' 0.740656 15 19
'office' 0.743487 10 13
'office.' 0.746949 2 3
'skip:w 10' 0.775034 9 14
'placed' 0.783364 6 10
'never' 0.786676 14 23
'proto:http' 0.7922 126 209
'bank' 0.794537 2 4
'canada' 0.794537 2 4
'lose' 0.794537 2 4
'to:addr:advsol.com' 0.810946 113 211
'header:Date:1' 0.818663 115 226
'header:From:1' 0.818663 115 226
'header:Return-Path:1' 0.81995 114 226
'header:MIME-Version:1' 0.822692 93 188
'url:com' 0.824882 50 103
'himself' 0.830376 1 3
'money.' 0.830376 1 3
'subject:The' 0.830376 1 3
'header:Message-ID:1' 0.83038 90 192
'purse' 0.844828 0 1
'savings' 0.844828 0 1
'url:pa' 0.844828 0 1
'header:Received:5' 0.864936 55 154
'money' 0.893362 10 38
'desk' 0.908163 0 2
'wall.' 0.908163 0 2
'url:yahoo' 0.933347 1 9
'mirror' 0.934783 0 3
'positive' 0.949438 0 4
'url:*http' 0.965116 0 6
'url:us' 0.966868 1 19
'woman' 0.969799 0 7
'url:rd' 0.973373 0 8
'url:' 0.976786 3 64
'100%' 0.990798 0 24
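
For what it is worth, the 2% figure appears consistent with the '*H*' and '*S*' rows at the top of the table, assuming the final score under SpamBayes' chi-squared combining is (1 + S - H) / 2. A minimal sketch of that last step only; the function name `combined_score` is illustrative and not part of SpamBayes:

```python
def combined_score(h, s):
    """Combine the ham indicator H and spam indicator S into one score.

    Assumes the chi-squared combining rule score = (1 + S - H) / 2,
    so H near 1 and S near 0 give a score near 0 (ham).
    """
    return (1.0 + s - h) / 2.0


# Values taken from the '*H*' and '*S*' rows of the clue report.
score = combined_score(h=0.993503, s=0.024313)
print(f"{score:.4f}")  # 0.0154
```

The result agrees with the reported 0.0154048 up to the rounding of the displayed H and S values, so the score itself looks internally consistent with the clue report.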