[Spambayes] Question about the ratio of Spam to Ham you
should train on...
Furrh, Andrew
afurrh at bumail.bradley.edu
Mon Oct 4 23:12:18 CEST 2004
Thanks for the info! I'm going to read through that.
-----Original Message-----
From: spambayes-bounces at python.org [mailto:spambayes-bounces at python.org]
On Behalf Of Richie Hindle
Sent: Sunday, October 03, 2004 7:19 AM
To: spambayes at python.org
Subject: Re: [Spambayes] Question about the ratio of Spam to Ham you
should train on...
[Andrew]
> I know you're supposed to train Spambayes on a roughly equal
> amount of Spam and Ham. Does that mean you should try to train on one
> new Ham for every Spam you train, even if all your Ham is already being
> correctly identified by Spambayes?
> I get VASTLY more Spam than good mail, and in the last month of using
> Spambayes I've ended up training on over 200 spams, and only 33 hams.
[Graham]
> I'm in a similar position, and would be really interested in the
> opinions of the developers. I tend to train on my (already correctly
> classified) ham, just to try and keep the numbers even.
I personally try to keep the numbers even, by training on
correctly-classified ham. The fact that it's already correctly classified
doesn't mean that training on it is no use - it's still worth doing.
There's been a lot written on the wiki about training strategies - start
at http://www.entrian.com/sbwiki/TrainingIdeas
--
Richie Hindle
richie at entrian.com
_______________________________________________
Spambayes at python.org
http://mail.python.org/mailman/listinfo/spambayes
Check the FAQ before asking: http://spambayes.sf.net/faq.html