[spambayes-dev] Another incremental training idea...

Tim Peters tim.one at comcast.net
Mon Jan 19 18:26:54 EST 2004


[Kenny Pitt]
> As I understood the article, the Outlook 2003 filter is actually a
> very well-trained Bayesian-type filter, and the MSN and Hotmail
> message flow almost certainly provided the data for that.

Yes, I'm sure they did.  Spam on Hotmail went way down shortly before OL2003
was shipped -- and shortly thereafter went right back up again.  I expect
they were testing the static feature weights shipped with OL2003, and
spammers quickly learned to outwit those.

> The problem with the Outlook filter is that it isn't user-trainable.

Yup.

> I wonder if Microsoft decided not to include that to avoid the
> accuracy problems that we often see reported when users mis-train
> the filter.  Given their average user base, any accuracy issues
> would no doubt be blamed on Microsoft and not user error.

MS accepts blame very well <wink>:  there's nobody at MS you can talk to
about a complaint, unless you pay for the privilege, and vendors
incorporating MS stuff as OEM code are stuck providing support themselves.
I think it's more likely that developing the UI code for OL was intractable
in the time they had.  I don't think MS would ship something as complex as,
e.g., the SpamBayes UI, in a consumer product, and UI code is darned hard
regardless.  Or it could indeed be that real-life feedback from MSN 8 wasn't
as good as anticipated.

> Rumor has it that the MSN Explorer mail reading interface for MSN
> dialup accounts does, in fact, support user training.  Can you
> confirm or deny?

Not personally -- I never installed MSN 8.  I retain that dialup account
simply because it gets me a solid nationwide phone network for dialup access
when I'm on the road.  It's the "ISP" part I pay them for, not the "MSN"
part.  According to this:

   http://research.microsoft.com/~joshuago/spamconferenceshort.ppt

it *does* do user-driven learning, but that's the most detailed account I've
seen, and it doesn't really reveal anything.



