[Spambayes] Mixed case words in heading

bill parducci bill at parducci.net
Fri Apr 11 10:07:22 EDT 2003


spambayes doesn't 'whitelist'. the idea is to build a profile through 
training. admittedly, it may take a while for some messages to be 
classified properly (i personally have problems with state department 
notices because they share so many words with mail scams), but in the 
vast majority of cases it will happen with time (as long as you 
continue to retrain! :o)

there have been some discussions about training on various components of 
each message and using various techniques to combine those scores. i am 
not sure what the current status is, but i expect that "[someone] build 
it. test it. show it." would be a good guess :o)

as to the 'mixed case' issue, i believe that there have been a couple of 
different tests looking at case, etc., none of which showed a 
statistically significant effect. therefore, i *think* that the scoring 
is currently case insensitive (i would assume to keep the db size down).
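
to make the db size point concrete, here is a rough sketch. it is NOT 
the actual spambayes tokenizer; the tokenize() function, its fold_case 
flag and the sample strings are all made up for illustration. it just 
counts how many distinct tokens you end up storing with and without 
case folding:

```python
import re

def tokenize(text, fold_case=True):
    """split text into word tokens, optionally lowercasing first.

    hypothetical helper for illustration, not spambayes code.
    """
    if fold_case:
        text = text.lower()
    return re.findall(r"[A-Za-z0-9$']+", text)

# made-up sample phrases that differ only in case
samples = ["FREE mOnEy now", "Free Money NOW", "free money now"]

folded = set()  # distinct tokens when case is folded
exact = set()   # distinct tokens when case is preserved
for s in samples:
    folded.update(tokenize(s, fold_case=True))
    exact.update(tokenize(s, fold_case=False))

print(len(folded), len(exact))  # prints: 3 8
```

with folding, a mixed case variant that has never been seen before 
(say 'mOnEy') lands on the same token as 'money', so it picks up 
whatever that token has already learned. without folding, every new 
variant starts from scratch and takes up its own db entry.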

b

Jan Fure wrote:
> Hi;
> 
> I have noticed that many spam messages contain words with mixed case. I 
> am wondering whether spambayes has any provision for increasing the spam 
> rating when that occurs, even if that particular mixed case word has not 
> been encountered before?
> 
> Also, are there any provisions for creating a whitelist, or is that 
> typically not necessary, as the filter algorithm is effective enough?
> 
> Jan
> 
> 
> _______________________________________________
> Spambayes mailing list
> Spambayes at python.org
> http://mail.python.org/mailman/listinfo/spambayes




