[Spambayes] Interesting behaviour from the Outlook client

Moore, Paul Paul.Moore at atosorigin.com
Wed Dec 4 09:01:00 2002


Over the past few days, I've been seeing an increase in FNs and unsures. I initially trained on my inbox and spam folders (386 ham, 999 spam), and since then I've trained on errors only. I'm now at 391 ham and 1011 spam. At first, I was getting no errors and 1 or 2 unsures per day. Now I'm starting to get at least 1 FN per day, and a slight increase in the unsure rate.

It's far too early to tell, but could this be related to Tim's code to handle unbalanced training sets? As time goes on, the spam:ham ratio will increase (as FNs happen more often than FPs), and so the impact of spam clues will be lessened (by Tim's code). I'll keep monitoring this, but my "real life" mail is definitely unbalanced: home is massively biased in favour of spam and work massively biased in favour of ham, although I pre-filter mailing lists, which muddies the water badly.
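
As a rough sanity check on the ratio argument, here is a minimal Python sketch that compares the spam:ham training ratio at the two points quoted above. The function name is made up for illustration and is not part of the Spambayes code, and it says nothing about how Tim's adjustment actually uses the counts. With these particular numbers the ratio works out at about 2.59 both times, so any drift so far is very small.

```python
def spam_to_ham_ratio(ham: int, spam: int) -> float:
    """Return how many spam messages were trained per ham message.

    Hypothetical helper for illustration only; not a Spambayes API.
    """
    if ham <= 0:
        raise ValueError("ham count must be positive")
    return spam / ham


# Counts taken from the message above.
initial = spam_to_ham_ratio(ham=386, spam=999)   # after the initial training
current = spam_to_ham_ratio(ham=391, spam=1011)  # after training on errors only

print(f"initial spam:ham = {initial:.3f}")
print(f"current spam:ham = {current:.3f}")
print(f"relative change  = {(current - initial) / initial:+.2%}")
```

Logging these two counts every day or so would show whether the ratio really does climb over time, which is the premise the hypothesis rests on.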

I dunno. Do the testing gurus round here have any idea whether this type of hypothesis could be tested in practice?

Paul.


