[spambayes-dev] Another incremental training idea...
tim.one at comcast.net
Mon Jan 19 18:26:54 EST 2004
> As I understood the article, the Outlook 2003 filter is actually a
> very well-trained Bayesian-type filter, and the MSN and Hotmail
> message flow almost certainly provided the data for that.
Yes, I'm sure they did. Spam on Hotmail went way down shortly before OL2003
was shipped -- and shortly thereafter went right back up again. I expect
they were testing the static feature weights shipped with OL2003, and
spammers quickly learned to outwit those.
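To make the static-vs-trainable distinction concrete, here's a toy sketch. It is not the OL2003 filter (nobody outside MS has seen that), and it isn't SpamBayes' actual scoring either: the ToyFilter class, its frozen flag, and the averaged per-token score are all made up for illustration. The point is only that frozen weights can't react to tokens the vendor never saw, while a user-trained copy can.

```python
from collections import Counter

class ToyFilter:
    """Hypothetical per-token scorer, for illustration only."""

    def __init__(self):
        self.spam = Counter()   # token -> number of spam messages containing it
        self.ham = Counter()    # token -> number of ham messages containing it
        self.nspam = 0
        self.nham = 0
        self.frozen = False     # True models weights fixed at ship time

    def learn(self, tokens, is_spam):
        if self.frozen:
            return              # static weights: user feedback is ignored
        counts = self.spam if is_spam else self.ham
        counts.update(set(tokens))
        if is_spam:
            self.nspam += 1
        else:
            self.nham += 1

    def token_prob(self, tok):
        s = self.spam[tok] / self.nspam if self.nspam else 0.0
        h = self.ham[tok] / self.nham if self.nham else 0.0
        if s + h == 0:
            return 0.5          # never-seen token: no evidence either way
        return s / (s + h)

    def score(self, tokens):
        probs = [self.token_prob(t) for t in set(tokens)]
        return sum(probs) / len(probs) if probs else 0.5


def vendor_trained():
    f = ToyFilter()
    f.learn(["viagra", "cheap"], True)
    f.learn(["meeting", "patch"], False)
    return f

shipped = vendor_trained()
shipped.frozen = True           # weights are fixed once shipped
personal = vendor_trained()     # same start, but the user can keep training

evasive = ["v1agra", "ch3ap"]   # obfuscated tokens the vendor never saw
before = shipped.score(evasive)

shipped.learn(evasive, True)    # no effect on the frozen filter
personal.learn(evasive, True)   # the trainable filter picks them up

static_after = shipped.score(evasive)
trained_after = personal.score(evasive)
```

In this toy the frozen filter stays at "unsure" on the obfuscated tokens no matter how often the user complains, while the trainable one scores them as spam after a single correction.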
> The problem with the Outlook filter is that it isn't user-trainable.
> I wonder if Microsoft decided not to include that to avoid the
> accuracy problems that we often see reported when users mis-train
> the filter. Given their average user base, any accuracy issues
> would no doubt be blamed on Microsoft and not user error.
MS accepts blame very well <wink>: there's nobody at MS you can talk to
about a complaint, unless you pay for the privilege, and vendors
incorporating MS stuff as OEM code are stuck providing support themselves. I
think it more likely that developing the UI code for OL was intractable in the time
they had. I don't think MS would ship something as complex as, e.g., the
SpamBayes UI, in a consumer product, and UI code is darned hard regardless.
Or it could indeed be that real-life feedback from MSN 8 wasn't as good as
> Rumor has it that the MSN Explorer mail reading interface for MSN
> dialup accounts does, in fact, support user training. Can you
> confirm or deny?
Not personally -- I never installed MSN 8. I retain that dialup account
simply because it gets me a solid nationwide phone network for dialup access
when I'm on the road. It's the "ISP" part I pay them for, not the "MSN"
part. According to this:
it *does* do user-driven learning, but that's the most detailed account I've
seen, and it doesn't really reveal anything.