[Spambayes] Maintain training with Outlook?
tameyer at ihug.co.nz
Wed Feb 25 01:47:39 EST 2004
> The simple version first: What's the best way to maintain
> training with the Outlook plugin?
The simple answer first <wink>. There is no consensus on a 'best way'.
> The FAQ advises training
> on "a few ham and a few spam" on a regular basis, but it's
> not obvious how to train on ham that hasn't been
Simply training on all misclassified mail (including unsures, which aren't
really misclassifications) should give pretty good results. It's almost
certainly better than training on everything, and probably close to as good
as any other training regime.
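To make "train on mistakes" concrete, here is a rough sketch in plain Python.
It is not the plugin's code: the cutoffs (0.20 and 0.90), the function names
and the sample messages are all assumptions made up for illustration. It just
shows which messages the regime would pick out for training.

```python
# Illustrative cutoffs only (an assumption); use whatever thresholds
# your own filter is configured with.
HAM_CUTOFF = 0.20
SPAM_CUTOFF = 0.90


def classify(score):
    """Map a spam probability in [0, 1] to ham / unsure / spam."""
    if score < HAM_CUTOFF:
        return "ham"
    if score >= SPAM_CUTOFF:
        return "spam"
    return "unsure"


def needs_training(score, is_spam):
    """True for false positives, false negatives, and unsures."""
    actual = "spam" if is_spam else "ham"
    return classify(score) != actual


# (name, score, is_spam) -- made-up sample data
messages = [
    ("newsletter", 0.03, False),    # correctly ham: skip
    ("invoice", 0.95, False),       # false positive: train as ham
    ("pills", 0.99, True),          # correctly spam: skip
    ("lottery", 0.10, True),        # false negative: train as spam
    ("mailing list", 0.55, False),  # unsure: train as ham
]

to_train = [(name, is_spam)
            for name, score, is_spam in messages
            if needs_training(score, is_spam)]
```

Everything the filter already gets right is left alone, so the database only
grows by the messages it found hard.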
> The "Recover from Spam" button is only
> present in the Junk Mail and Junk Mail Suspects folders, and
> the wizard completely rebuilds the database; is there
> something less drastic that I'm missing?
Two options, if you do want to do more:
1. The "Training" tab of the Manager dialog has a button to train all
messages in a particular folder (or set of folders) as ham/spam ("Train
Now"). You can elect to rebuild the database from scratch, or just add to it.
2. You can enable "incremental training" (also on the "Training" tab).
This means that mail will be trained as ham when you move mail into a folder
you are watching (like the Inbox), and as spam when it's moved into the spam
folder. If mail is already in (e.g.) the Inbox, then you have to move it out
and then back in again for it to be trained.
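The incremental training rule in option 2 can be pictured roughly like this.
This is only a sketch of the rule as described above, not how the plugin is
actually written; the function name, the returned strings and the folder
names are hypothetical.

```python
def training_action(dest_folder, watched_folders, spam_folder):
    """Decide what a move into dest_folder means for training.

    Returns "train_as_spam", "train_as_ham", or None (no training).
    """
    if dest_folder == spam_folder:
        return "train_as_spam"
    if dest_folder in watched_folders:
        return "train_as_ham"
    return None


# Hypothetical folder setup
watched = {"Inbox"}
spam = "Junk Mail"

moved_to_spam = training_action("Junk Mail", watched, spam)
moved_to_inbox = training_action("Inbox", watched, spam)
moved_elsewhere = training_action("Archive", watched, spam)
```

Training hangs off the move itself, which is why mail that is already
sitting in a watched folder has to be moved out and back in.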
> My spam/ham ratio before filtering is probably close to 10:1.
> How do I choose which and how many messages to train on?
The wiki (http://entrian.com/sbwiki) has a lot of details, as there really
isn't a consensus on the best method. Simply training on mistakes (false
positives, false negatives, and unsures) should give a good result, and will
probably keep the ratio reasonably balanced (if not, then just select some
of the mistakes - probably those with a score closest to 0.5).
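If the mistakes themselves end up lopsided, picking "those with a score
closest to 0.5" could look something like the sketch below. The function name
and the sample scores are invented for illustration; it simply keeps the most
uncertain messages from the larger pile, as many as there are in the smaller
one.

```python
def pick_most_uncertain(mistakes, limit):
    """Return up to `limit` (name, score) pairs closest to a score of 0.5."""
    return sorted(mistakes, key=lambda m: abs(m[1] - 0.5))[:limit]


# Made-up (name, score) pairs for mistakes that should be trained
spam_mistakes = [("a", 0.05), ("b", 0.45), ("c", 0.60), ("d", 0.30), ("e", 0.88)]
ham_mistakes = [("x", 0.92), ("y", 0.51)]

# Keep only as many spam mistakes as there are ham mistakes,
# preferring the ones the filter was least sure about.
balanced_spam = pick_most_uncertain(spam_mistakes, len(ham_mistakes))
```

With these sample scores, "b" (0.45) and "c" (0.60) are the two that get
kept.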
> Now, for extra credit: I don't actually use Outlook much --
> I use Pocket Outlook on my PDA, and do as much as I can on
> the train. The plugin buttons obviously aren't available,
> and moving messages into the Junk folders doesn't work
> either. Apparently ActiveSync moves the messages behind the
> scenes and SpamBayes doesn't see them move. I made a couple
> of special folders, TrainJunk and TrainGood, to sort messages
> into when using Pocket Outlook; when I'm back on the desktop
> I "Delete as Spam" the entire TrainJunk folder. I haven't
> figured out what to do with the TrainGood folder yet. Does
> anyone see a less cumbersome way of handling this?
Not apart from getting Mark to spend time figuring out a way to notice those
moves, or getting Microsoft (or the PDA maker) to make the notices appear
correctly. (For the former, you could submit a feature request at
Please always include the list (spambayes at python.org) in your replies
(reply-all), and please don't send me personal mail about SpamBayes. This
way, you get everyone's help, and avoid a lack of replies when I'm busy.