[spambayes-dev] Another piece of anecdotal evidence
Eli Stevens (WG.c)
listsub at wickedgrey.com
Wed Jan 14 14:17:55 EST 2004
Skip Montanaro wrote:
> Alex> Total: 4694 ham, 39913 spam (89.48% spam)
> Alex> Trained: 204 ham, 10994 spam (98.18% spam)
>
> Alex> Having such a high imbalance does seem to make me particularly
> Alex> susceptible to training errors... but doesn't seem to hurt
> Alex> otherwise.
Does it hurt more when a FP or FN is mistrained?
> How do you plan to find those mistrained messages?
Hmm... How feasible is:
    trainEverything()
    for msg in hamCorpus:
        untrain( msg )
        result = classify( msg )
        if result == spam:
            display( msg )
        train( msg )
This won't work if the mistrained messages are not very spammy, but in
that case they shouldn't be affecting classification adversely, right?
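
For concreteness, here is a minimal runnable sketch of the loop above in Python. It is not the SpamBayes API: ToyClassifier, its train/untrain/classify methods, and find_suspect_ham are hypothetical names made up for illustration, and the scoring is a crude smoothed log-odds sum rather than the real combining scheme. It assumes messages are already tokenized into lists of words.

```python
import math
from collections import Counter


class ToyClassifier:
    """Hypothetical stand-in for a classifier that supports untraining."""

    def __init__(self):
        # Per-class count of messages containing each token.
        self.counts = {"ham": Counter(), "spam": Counter()}
        self.nmsgs = {"ham": 0, "spam": 0}

    def train(self, tokens, label):
        self.counts[label].update(set(tokens))
        self.nmsgs[label] += 1

    def untrain(self, tokens, label):
        # Exact inverse of train(), so train/untrain pairs cancel out.
        self.counts[label].subtract(set(tokens))
        self.nmsgs[label] -= 1

    def classify(self, tokens):
        # Sum of per-token log-odds with add-one smoothing.
        # Class priors are deliberately ignored in this toy version.
        score = 0.0
        for tok in set(tokens):
            p_spam = (self.counts["spam"][tok] + 1) / (self.nmsgs["spam"] + 2)
            p_ham = (self.counts["ham"][tok] + 1) / (self.nmsgs["ham"] + 2)
            score += math.log(p_spam / p_ham)
        return "spam" if score > 0 else "ham"


def find_suspect_ham(clf, ham_corpus):
    """Return trained-as-ham messages that score as spam when held out."""
    suspects = []
    for tokens in ham_corpus:
        clf.untrain(tokens, "ham")          # hold the message out
        if clf.classify(tokens) == "spam":  # would it be called spam?
            suspects.append(tokens)         # candidate for manual review
        clf.train(tokens, "ham")            # restore the original state
    return suspects
```

The same loop over the spam corpus (untrain as spam, flag anything that classifies as ham) would cover the other direction. Each pass costs one untrain, one classify and one train per message, so it is linear in the size of the trained corpus.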
Eli