[Spambayes] full o' spaces
Tim Peters
tim_one at email.msn.com
Sun Mar 9 02:13:56 EST 2003
[Tim Peters]
> There haven't been enough test reports on that one to decide.
> It's True by default in the Outlook client, but still appears to be
> False by default everywhere else. There are bad visible effects
> either way (if it's off and you get a large ratio imbalance, it's too
> easy for a msg to score incorrectly as belonging to the more popular
> category; if it's on and you get a large ratio imbalance, training on
> another example from the more popular category has little effect,
> exacerbating (for example) the "but I trained on it and it's *still*
> called ham!" irritation).
[Tim Stone]
> Rats. I thought it was True by default.
It is if you're using the Outlook client.
> All this time I've been using it thinking it was on... ok, so if I
> turn it on now, what would I expect?
Did you read the paragraph you quoted? I've written several small essays on
the topic here, and think the parenthetical comments above are a decent
summary.
> I have a huge ham/spam imbalance in my notes sb database,
Striving for balance is likely a better idea.
> and have been a bit disappointed by the classifier...
I'm short on telepathy tonight. Perhaps the *way* in which you're
disappointed is related to the comments above? For example, if you have
much more ham than spam and a too-high false negative (FN) rate, or much
more spam than ham and a too-high false positive (FP) rate, then the
comments are directly applicable.
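
As a rough illustration of that check, here is a minimal sketch in Python.
It is not part of Spambayes: the function name and both thresholds are
hypothetical, chosen only to make the rule of thumb concrete. It takes the
number of trained ham and spam messages plus observed FP and FN rates, and
reports whether the imbalance pattern described above seems to match.

```python
def imbalance_likely_relevant(nham, nspam, fp_rate, fn_rate,
                              ratio_threshold=2.0, rate_threshold=0.05):
    """Return a short diagnosis string, or None if neither pattern matches.

    nham, nspam      -- number of ham / spam messages trained on
    fp_rate, fn_rate -- observed false positive / false negative rates (0..1)
    ratio_threshold  -- hypothetical cutoff for calling the training data
                        "imbalanced" (larger count / smaller count)
    rate_threshold   -- hypothetical cutoff for calling an error rate
                        "too high"
    """
    if nham <= 0 or nspam <= 0:
        raise ValueError("need at least one trained ham and one trained spam")

    # Much more ham than spam: messages drift toward the popular category
    # (ham), so spam slips through as false negatives.
    if nham >= ratio_threshold * nspam and fn_rate > rate_threshold:
        return "more ham than spam, high FN rate: spam is scoring as ham"

    # Much more spam than ham: the mirror image, ham gets flagged as spam.
    if nspam >= ratio_threshold * nham and fp_rate > rate_threshold:
        return "more spam than ham, high FP rate: ham is scoring as spam"

    return None


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    print(imbalance_likely_relevant(5000, 200, fp_rate=0.0, fn_rate=0.2))
```

If the function returns None, the disappointment probably has some other
cause, and more detail about what is going wrong would be needed.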