[spambayes-dev] imbalance within ham or spam training sets?
T. Alexander Popiel
popiel at wolfskeep.com
Tue Nov 4 17:41:47 EST 2003
In message: <20031103220757.323B72DF59 at cashew.wolfskeep.com>
"T. Alexander Popiel" <popiel at wolfskeep.com> writes:
>Perhaps it's time to test a variation where the prob is based on
>hamcount and spamcount instead of hamratio and spamratio. Hrm.
>*tap, tap, tap* I'll be back in a few hours...
FWIW, basing the prob on the raw counts instead of the ratios is
a clear-cut loss. It won only twice on the false positives
(by relatively small margins), but lost EVERY time on the false
negatives, by large amounts.
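
For anyone following along, here is a minimal sketch of the two variants being compared. It is not the code that was actually tested: the function names are made up for illustration, the ratio form is assumed to be spamratio / (spamratio + hamratio) with each ratio being the token count divided by the number of messages trained in that class, and any later adjustment for rare tokens is left out.

```python
def prob_from_ratios(spamcount, hamcount, nspam, nham):
    """Token spam probability from per-class ratios (hypothetical helper).

    Assumes nspam > 0, nham > 0, and the token was seen at least once.
    """
    spamratio = spamcount / nspam  # fraction of spam containing the token
    hamratio = hamcount / nham     # fraction of ham containing the token
    return spamratio / (spamratio + hamratio)


def prob_from_counts(spamcount, hamcount):
    """Token spam probability from raw counts (hypothetical helper).

    Ignores how many ham and spam messages were trained.
    Assumes the token was seen at least once.
    """
    return spamcount / (spamcount + hamcount)


# Illustrative numbers only: a token seen in 10 of 100 spam
# and 10 of 1000 ham.
ratio_prob = prob_from_ratios(10, 10, nspam=100, nham=1000)
count_prob = prob_from_counts(10, 10)
print(ratio_prob, count_prob)
```

With these made-up numbers the ratio form rates the token as strongly spammy, while the count form rates it as neutral, because the count form never sees that the ham set is ten times larger. That only shows how the two formulas diverge when the training sets differ in size; it is not offered as the explanation for the test results above.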