[Spambayes] Outlook plugin - training
Wed Nov 6 19:27:41 2002
[Moore, Paul, on the Outlook2K client]
> Actually, I'm not sure I like "Potential Spam" being treated as spam
> until confirmed as OK.
It doesn't. "Potential Spam" really means "unsure" -- it would be as
accurate to call it "Potential Ham", but neither is as accurate as Unsure.
The system knows it doesn't know what to call msgs in this category, and the
client doesn't automatically train on Unsure msgs (unless you *manually*
drag one into your Spam folder, or into one of your Ham folders).
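To make the three-way split concrete, here's a minimal sketch, not the addin's actual code. The cutoff values and the names classify() and label_for_manual_move() are assumptions made up for illustration. The only points taken from the above are that there's a middle "unsure" band, and that a msg gets a training label only when you move it by hand.

```python
# Illustrative cutoffs only (assumption); the real client's values may differ.
HAM_CUTOFF = 0.20
SPAM_CUTOFF = 0.90


def classify(score, ham_cutoff=HAM_CUTOFF, spam_cutoff=SPAM_CUTOFF):
    """Map a spam score in [0.0, 1.0] to 'ham', 'unsure' or 'spam'."""
    if score < ham_cutoff:
        return "ham"
    if score >= spam_cutoff:
        return "spam"
    # Everything in between is neither: the system knows it doesn't know.
    return "unsure"


def label_for_manual_move(dest_folder, spam_folder, ham_folders):
    """Return the training label implied by a manual drag, or None.

    Hypothetical helper: a msg sitting in the Unsure folder has no label
    until the user drags it into the Spam folder or into a Ham folder.
    """
    if dest_folder == spam_folder:
        return "spam"
    if dest_folder in ham_folders:
        return "ham"
    return None
```

With these made-up cutoffs, a score of 0.5 lands in "unsure", and leaving the msg in the Unsure folder yields no training label at all.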
> I have Rules Wizard rules which sort E-Mail traffic out into folders.
> I'm entirely happy with the behaviour I understand to be the case - rules
> processed before the plugin - as I don't get spam on list addresses, so
> I'm OK with list traffic being totally excluded from the spam process.
The Define Filters dialog has a multi-selection folder control, so you can
tell the client to watch any number of folders (you're not limited to the
Inbox alone; add the destination folders of your other Outlook rules if you
want email coming into those watched too).
The interaction with Outlook's Rules Wizard (RW) remains unclear. The RW's
internal workings appear undocumented, and there appears to be no way to hook into
it. I've definitely seen the addin's filtering rules trigger *while* the RW
was still running, and in some cases that can lead to the addin's filtering
looking at a msg more than once. For example, the addin's filter may
trigger when a msg first arrives in the Inbox, and then a second time on the
same msg when the RW moves it into a different folder that the addin's
filter is also watching. In this case the client suffers an internal
exception, as the entry ID Outlook told it to use for the first trigger gets
invalidated by the move. It works OK in the end, but "something isn't quite
right" about it.
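The double-trigger sequence can be modeled with a toy msg store. This is a sketch under stated assumptions, not the addin's code or Outlook's API: ToyStore, StaleEntryID and on_message_added() are all hypothetical names, and the "move reissues the entry ID" behavior is just a stand-in for the invalidation described above.

```python
class StaleEntryID(Exception):
    """Raised by the toy store when an entry ID no longer resolves."""


class ToyStore:
    """Hypothetical msg store in which moving a msg invalidates its ID."""

    def __init__(self):
        self._msgs = {}  # entry ID -> (folder, text)
        self._next_id = 0

    def deliver(self, folder, text):
        self._next_id += 1
        entry_id = f"id-{self._next_id}"
        self._msgs[entry_id] = (folder, text)
        return entry_id

    def move(self, entry_id, new_folder):
        # The old ID stops resolving; the msg reappears under a new one.
        _, text = self._msgs.pop(entry_id)
        return self.deliver(new_folder, text)

    def get(self, entry_id):
        try:
            return self._msgs[entry_id]
        except KeyError:
            raise StaleEntryID(entry_id) from None


def on_message_added(store, entry_id, watched_folders):
    """Handle one trigger; return 'stale', 'ignored' or 'filtered'."""
    try:
        folder, _text = store.get(entry_id)
    except StaleEntryID:
        # The msg moved before we got to it; a later trigger covers it.
        return "stale"
    if folder not in watched_folders:
        return "ignored"
    return "filtered"


store = ToyStore()
watched = {"Inbox", "Lists"}
first_id = store.deliver("Inbox", "list traffic")  # trigger 1 is queued
second_id = store.move(first_id, "Lists")          # a rule moves the msg
results = [on_message_added(store, first_id, watched),
           on_message_added(store, second_id, watched)]
```

In this toy model the first trigger comes back "stale" and the second comes back "filtered", which is the "works OK in the end" outcome: the failure on the first ID is tolerated, because the same msg shows up again under its new ID.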
> But I've had a couple of list messages end up in "Potential Spam".
> Either the rules wizard is missing them (possible, I never had much
> confidence in it :-() or the plugin is interfering somehow.
Sorry, can't say without a concrete example to stare at. I haven't seen the
addin make any mistakes here, although it's common to get baffled about
exactly why the RW does what it does.
> I think I may switch off the "potential spam" bit, and just filter
> out known spam, and classify my Inbox by hand. I'll leave it a bit
> longer before deciding, though.
You'll be happier if you keep an Unsure folder. For me, about 1% of my
email ends up there, about half-and-half ham vs spam, and my Inbox is
virtually spam-free (while my Spam folder is pure spam now -- about 100 per
Another: Note that this is pre-alpha software, and you should definitely
keep persistent Ham and Spam folders for training, as updating the code may
invalidate your database(s), or introduce tokenization and/or scoring and/or
configuration changes that render your database(s) worse than useless. IOW,
you should stay prepared to retrain from scratch. I set up a distinct .pst
file to hold Ham and Spam examples for this purpose, to keep from cluttering
my primary msg store. The folder controls in the addin (unlike several in
Outlook itself!) allow selecting multiple folders from multiple msg stores
too, and my Spam folder is actually in this other .pst file.
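Why the persistent folders matter: if you keep the examples, a database can always be rebuilt from nothing. Here's a minimal sketch, with a made-up whitespace tokenizer and a made-up database layout (neither is Spambayes' own), and with tokenize() and retrain_from_scratch() as hypothetical names.

```python
from collections import Counter


def tokenize(text):
    """Toy tokenizer (assumption): lowercase, split on whitespace, dedupe."""
    return set(text.lower().split())


def retrain_from_scratch(ham_texts, spam_texts):
    """Build a fresh database from saved examples, ignoring any old one."""
    db = {"nham": 0, "nspam": 0, "ham": Counter(), "spam": Counter()}
    for text in ham_texts:
        db["nham"] += 1
        db["ham"].update(tokenize(text))
    for text in spam_texts:
        db["nspam"] += 1
        db["spam"].update(tokenize(text))
    return db


# Stand-ins for the contents of the persistent Ham and Spam folders.
saved_ham = ["Meeting moved to Friday", "Patch attached for review"]
saved_spam = ["FREE money now", "Claim your free prize now"]

db = retrain_from_scratch(saved_ham, saved_spam)
```

If the tokenizer changes in a code update, rerunning retrain_from_scratch() over the same saved msgs gives a database consistent with the new code, which is the point of keeping the examples around.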