[Spambayes] Training on unusual ham - revisited

bob posert bposert at yahoo.com
Fri Feb 3 05:34:52 CET 2006


In an earlier thread ( http://mail.python.org/pipermail/spambayes/2006-January/018702.html ), Tim Peters and I discussed training on unusual ham: the monthly messages from http://www.boldtype.com. I just got another one, and it scored 50% on the spam scale. The clues follow; I'd really appreciate any help.
 Thanks,
 Bob

Combined Score: 50% (0.5)
Internal ham score (*H*): 1
Internal spam score (*S*): 1
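
For readers wondering how H = 1 and S = 1 turn into exactly 50%, here is a minimal sketch of chi-squared combining as it is commonly described for SpamBayes. It is an illustration under stated assumptions, not the project's actual code: the names `chi2q` and `combine` are made up for this example, every probability is assumed to lie strictly between 0 and 1, and the care the real classifier takes against floating-point underflow is left out.

```python
import math


def chi2q(x2, v):
    """Return P(chi-squared with v degrees of freedom >= x2); v must be even."""
    assert v % 2 == 0
    m = x2 / 2.0
    total = term = math.exp(-m)
    for i in range(1, v // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def combine(probs):
    """Return (H, S, score) for a list of per-token spam probabilities.

    Assumption: every p satisfies 0 < p < 1, as in the clue table.
    """
    n = len(probs)
    ln_spam_side = sum(math.log(1.0 - p) for p in probs)
    ln_ham_side = sum(math.log(p) for p in probs)
    S = 1.0 - chi2q(-2.0 * ln_spam_side, 2 * n)  # spam indicator
    H = 1.0 - chi2q(-2.0 * ln_ham_side, 2 * n)   # ham indicator
    return H, S, (S - H + 1.0) / 2.0


# Illustrative input only: 40 strongly hammy and 40 strongly spammy clues,
# using two spamprob values that appear in the clue table.
H, S, score = combine([0.0215311] * 40 + [0.978469] * 40)
print(H, S, score)  # both indicators saturate near 1, so the score lands near 0.5
```

When there is strong evidence on both sides, each indicator pins at 1 and the two cancel in (S - H + 1) / 2, which is consistent with the 50% reported here.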
 
# ham trained on: 1229
# spam trained on: 20331

150 Significant Tokens

token                               spamprob         #ham  #spam
'policy:'                           0.00858658         32      1
'url:dots'                          0.0115681          19      0
'url:books'                         0.0121951          18      0
'unsubscribe:'                      0.0129932          56      9
'url:gif)'                          0.0136778          16      0
'url:covers'                        0.0180723          12      0
'respects'                          0.0186053          32      7
'url:store'                         0.019152           28      6
'url:ad'                            0.0192872          19      3
'criticism,'                        0.0196507          11      0
'pages:'                            0.0196507          11      0
'url:func'                          0.0196507          11      0
'url:grey_dot'                      0.0196507          11      0
'c/o'                               0.0206833          15      2
'list.'                             0.0207364          51     15
'list,'                             0.0211427          89     29
'10012'                             0.0215311          10      0
'1212'                              0.0215311          10      0
'594'                               0.0215311          10      0
'anjuli'                            0.0215311          10      0
'ayer'                              0.0215311          10      0
'boldtype'                          0.0215311          10      0
'boldtype,'                         0.0215311          10      0
'boldtype.com'                      0.0215311          10      0
'complies'                          0.0215311          10      0
'email addr:boldtype.com'           0.0215311          10      0
'email addr:boldtype.com.'          0.0215311          10      0
'email name:subscriptions'          0.0215311          10      0
'email-based'                       0.0215311          10      0
'flavorpill'                        0.0215311          10      0
'from:addr:boldtype.com'            0.0215311          10      0
'from:name:boldtype'                0.0215311          10      0
'glei'                              0.0215311          10      0
'graphical'                         0.0215311          10      0
'handpicked'                        0.0215311          10      0
'laster'                            0.0215311          10      0
'magazines:'                        0.0215311          10      0
'mangan'                            0.0215311          10      0
'monthly,'                          0.0215311          10      0
'npr'                               0.0215311          10      0
'publishers,'                       0.0215311          10      0
'sascha'                            0.0215311          10      0
'subject: | '                       0.0215311          10      0
'submissions'                       0.0215311          10      0
'url:108s877'                       0.0215311          10      0
'url:392388'                        0.0215311          10      0
'url:boldtype'                      0.0215311          10      0
'url:boldtype_com'                  0.0215311          10      0
'url:boldtype_logo'                 0.0215311          10      0
'url:border-pixel'                  0.0215311          10      0
'url:current'                       0.0215311          10      0
'url:designby'                      0.0215311          10      0
'url:earplug'                       0.0215311          10      0
'url:editor'                        0.0215311          10      0
'url:email_address'                 0.0215311          10      0
'url:email_subject'                 0.0215311          10      0
'url:flavorpill'                    0.0215311          10      0
'url:issues'                        0.0215311          10      0
'url:jcreport'                      0.0215311          10      0
'url:job_id'                        0.0215311          10      0
'url:listing_id'                    0.0215311          10      0
'url:mail_list_id'                  0.0215311          10      0
'url:other-pubs_culture'            0.0215311          10      0
'url:other-pubs_earplug'            0.0215311          10      0
'url:other-pubs_fashion'            0.0215311          10      0
'url:other-pubs_flavorpill'         0.0215311          10      0
'url:other-pubs_jcreport'           0.0215311          10      0
'url:other-pubs_music'              0.0215311          10      0
'url:partnerships'                  0.0215311          10      0
'url:spamlaws'                      0.0215311          10      0
'url:sublit'                        0.0215311          10      0
'url:subscriber_id'                 0.0215311          10      0
'url:}'                             0.0215311          10      0
'weissman'                          0.0215311          10      0
'nonfiction'                        0.0215731          12      1
'url:monthly'                       0.0215731          12      1
"insider's"                         0.0220053          14      2
'url:mail'                          0.0227833          41     13
'reviews,'                          0.0233401          11      1
'url:archive'                       0.0233401          11      1
'url:sub'                           0.0233401          11      1
'publisher:'                        0.0238095           9      0
'url:grey_separation2'              0.0238095           9      0
'url:zoom_in'                       0.0238095           9      0
'1950'                              0.97619             0      9
'abigail'                           0.97619             0      9
'clifford'                          0.97619             0      9
'handkerchief'                      0.97619             0      9
'image.'                            0.97619             0      9
'prospect,'                         0.97619             0      9
'rogue'                             0.97619             0      9
'belle'                             0.978469            0     10
'boyle'                             0.978469            0     10
'celluloid'                         0.978469            0     10
'errol'                             0.978469            0     10
'gritty'                            0.978469            0     10
'mann'                              0.978469            0     10
'overlooked'                        0.978469            0     10
'beethoven'                         0.980349            0     11
'cecil'                             0.980349            0     11
'crafty'                            0.980349            0     11
'examine'                           0.980349            0     11
'whims'                             0.980349            0     11
'1962'                              0.981928            0     12
'1983'                              0.981928            0     12
'alfred'                            0.981928            0     12
'dismissed'                         0.981928            0     12
'spanned'                           0.981928            0     12
'gentlemen,'                        0.983271            0     13
'jorge'                             0.983271            0     13
'parish'                            0.983271            0     13
'proximity'                         0.983271            0     13
'readily'                           0.983271            0     13
'chaplin'                           0.984429            0     14
'comparison.'                       0.984429            0     14
'dreamlike'                         0.984429            0     14
'explodes'                          0.984429            0     14
'stars,'                            0.984429            0     14
'otto'                              0.985437            0     15
'1985'                              0.987106            0     17
'atom'                              0.987106            0     17
'bourgeoisie'                       0.987106            0     17
'reject'                            0.987106            0     17
'sued'                              0.987106            0     17
'temporarily'                       0.987106            0     17
'imagined'                          0.987805            0     18
'praise'                            0.987805            0     18
'emperor'                           0.988432            0     19
'syndrome.'                         0.988432            0     19
'balfour'                           0.988998            0     20
'escaped'                           0.988998            0     20
'finalists'                         0.988998            0     20
'anxiety,'                          0.989978            0     22
'raging'                            0.989978            0     22
'tucked'                            0.989978            0     22
'apt'                               0.990798            0     24
'francis'                           0.990798            0     24
'face.'                             0.991159            0     25
'marie'                             0.992091            0     28
'respectful'                        0.99236             0     29
'dose'                              0.992611            0     30
'nineteen'                          0.993469            0     34
'disliked'                          0.994572            0     41
'awesome'                           0.99505             0     45
'ego'                               0.996397            0     62
'ladies'                            0.99676             0     69
'projection'                        0.997168            0     79
'broke'                             0.997512            0     90
'accordance'                        0.998921            0    208
'discreet'                          0.999019            0    229
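
The spamprob column can be reproduced from the #ham and #spam counts together with the training totals (1229 ham, 20331 spam). The sketch below assumes Robinson-style smoothing with an unknown-word probability of 0.5 and a strength of 0.45. Those values are believed to be the SpamBayes defaults, but treat them as assumptions if the options were changed. The function name `token_spamprob` is hypothetical.

```python
def token_spamprob(ham_count, spam_count, nham, nspam, strength=0.45, unknown=0.5):
    """Estimate a token's spam probability from its training counts.

    strength and unknown are assumed smoothing parameters (see the note above).
    """
    n = ham_count + spam_count
    if n == 0:
        return unknown  # never-seen token: fall back to the prior
    ham_ratio = ham_count / nham
    spam_ratio = spam_count / nspam
    raw = spam_ratio / (ham_ratio + spam_ratio)
    # Pull the raw estimate toward the prior; the pull weakens as n grows.
    return (strength * unknown + n * raw) / (strength + n)


NHAM, NSPAM = 1229, 20331  # training totals from the report
for token, ham, spam in [("boldtype", 10, 0), ("1950", 0, 9), ("policy:", 32, 1)]:
    print(token, round(token_spamprob(ham, spam, NHAM, NSPAM), 6))
```

Under these assumptions, a token seen in 10 ham and 0 spam comes out at about 0.0215 and one seen in 0 ham and 9 spam at about 0.976, which agrees with the table. Tokens seen only in ham or only in spam depend on their own count alone, which is why so many rows share the same value.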
 



