[Spambayes] Question about the ratio of Spam to Ham you should train on...

Furrh, Andrew afurrh at bumail.bradley.edu
Mon Oct 4 23:12:18 CEST 2004


Thanks for the info!  I'm going to read through that.

-----Original Message-----
From: spambayes-bounces at python.org [mailto:spambayes-bounces at python.org]
On Behalf Of Richie Hindle
Sent: Sunday, October 03, 2004 7:19 AM
To: spambayes at python.org
Subject: Re: [Spambayes] Question about the ratio of Spam to Ham you
should train on...


[Andrew]
> I know you're supposed to train Spambayes on a roughly equal
> amount of Spam and Ham.  Does that mean you should try to train on one
> new Ham for every Spam you train, even if all your Ham is already being
> correctly identified by Spambayes?
> I get VASTLY more Spam than good mail, and in the last month of using
> Spambayes I've ended up training on over 200 spams, and only 33 hams.

[Graham]
> I'm in a similar position, and would be really interested in the
> opinions of the developers. I tend to train on my (already correctly
> classified) ham, just to try and keep the numbers even.

I personally try to keep the numbers even, by training on
correctly-classified ham.  The fact that it's already correctly classified
doesn't mean that training on it is no use - it's still worth doing.
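
To put numbers on the imbalance Andrew describes, here is a minimal sketch
of the arithmetic.  It is not part of SpamBayes; the function names are
hypothetical, and 200 stands in for "over 200" spams.

```python
def ham_needed_to_balance(spam_trained: int, ham_trained: int) -> int:
    """Extra ham messages needed to even out the training counts.

    Hypothetical helper for illustration only, not a SpamBayes API.
    Returns 0 when ham already matches or outnumbers spam.
    """
    return max(0, spam_trained - ham_trained)


def spam_to_ham_ratio(spam_trained: int, ham_trained: int) -> float:
    """Ratio of trained spam to trained ham (infinite if no ham trained)."""
    if ham_trained == 0:
        return float("inf")
    return spam_trained / ham_trained


# Counts from the question: roughly 200 spams and 33 hams trained.
spam, ham = 200, 33
print(ham_needed_to_balance(spam, ham))         # 167
print(round(spam_to_ham_ratio(spam, ham), 1))   # 6.1
```

So evening things out would mean training on about 167 more
correctly-classified hams.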

There's been a lot written on the wiki about training strategies - start
at http://www.entrian.com/sbwiki/TrainingIdeas

-- 
Richie Hindle
richie at entrian.com

_______________________________________________
Spambayes at python.org
http://mail.python.org/mailman/listinfo/spambayes
Check the FAQ before asking: http://spambayes.sf.net/faq.html


