[Spambayes] training suggestions
Dhaval Patel
dhaval at patel.sh
Thu Aug 3 22:52:58 CEST 2006
> My fault. Yes, if you want to do incremental training with sb_mboxtrain,
> then leave off the -f flag. The fact that you are training once a day threw
> me though. If incremental training is what you want to do, why not run
> sb_mboxtrain more frequently than daily?
I just trained manually and noticed that it reports "trained 190 out of 190 messages" even
though there are only ~20 new messages since the last training. Is the output wrong?
My concern with running it at all is misclassification. How do I get it to fix the
messages that were misclassified?
>
> Dhaval> Now that I think about it, the best thing would be to do the
> Dhaval> following:
>
> Dhaval> 1. message comes in to the filter
> Dhaval> 2. filter sorts it as ham or spam
> Dhaval> 3. db is trained with this message ham or spam (just this one
> Dhaval> message) when the user sorts messages (put spam from inbox ->
> Dhaval> spam folder)
> Dhaval> 4. the changes to the db made in step 3 get undone and the
> Dhaval> message gets trained as spam (similarly if the user moved
> Dhaval> from spam folder -> inbox)
>
> Dhaval> Can spambayes do this? Can I specify just a message id or a list
> Dhaval> of message ids? ( I use maildir format mail storage)
>
> I'm not sure. Again, that's not the way I work. Nothing is ever trained
> automatically in my personal environment. If I see a message that is
> misclassified (either false negative, unsure or false positive) then I toss
> it into the appropriate training database. I never train on a message which
> was properly classified.
Going by your last sentence: when I leave out the -f, would I also avoid training on
messages that were properly classified? Or maybe not. How should incremental
training be handled if the previous training included misclassifications?
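
To make steps 3 and 4 from my earlier list concrete, here is a toy sketch of
"undo the wrong training, then train the other way". It is not SpamBayes code:
ToyTokenDB, learn, unlearn and retrain are hypothetical names for illustration,
and the per-token counting is a simplification. Whether sb_mboxtrain can do
this per message is exactly what I am asking.

```python
from collections import Counter


class ToyTokenDB:
    """Toy training database (hypothetical, NOT the SpamBayes classifier)."""

    def __init__(self):
        self.spam = Counter()   # token -> number of spam messages containing it
        self.ham = Counter()    # token -> number of ham messages containing it
        self.nspam = 0          # messages trained as spam
        self.nham = 0           # messages trained as ham

    def learn(self, tokens, is_spam):
        """Train on one message; each distinct token counts once."""
        counts = self.spam if is_spam else self.ham
        counts.update(set(tokens))
        if is_spam:
            self.nspam += 1
        else:
            self.nham += 1

    def unlearn(self, tokens, is_spam):
        """Undo an earlier learn() call for the same message and label."""
        counts = self.spam if is_spam else self.ham
        counts.subtract(set(tokens))
        for token in set(tokens):
            if counts[token] <= 0:
                del counts[token]   # drop tokens that no message supports
        if is_spam:
            self.nspam -= 1
        else:
            self.nham -= 1


def retrain(db, tokens, was_spam):
    """Step 4: remove the wrong training, then train with the opposite label."""
    db.unlearn(tokens, was_spam)
    db.learn(tokens, not was_spam)


# A message was auto-trained as ham, then the user moved it to the spam folder.
db = ToyTokenDB()
message = "cheap pills online".split()
db.learn(message, is_spam=False)
retrain(db, message, was_spam=False)
```

The point of doing unlearn before learn is that the database ends up as if the
message had only ever been trained with the corrected label.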
>
> Note also that I have the luxury of having a user population of one person.
> Sounds like you aren't so fortunate.
My users are pretty good, though. They don't have annoying questions or problems. :)
Thanks,
Dhaval