[Spambayes] overtraining and retraining

Jesus Cea jcea at jcea.es
Sun Oct 16 18:13:51 CEST 2011


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 16/10/11 17:47, Jesus Cea wrote:
> On 16/10/11 17:30, Jesus Cea wrote:
>> 2. When I train on a message, I keep training in a loop until
>> the message probability drops below 20% (ham) or rises above 90%
>> (spam). As the database ages, spam training needs more iterations;
>> that is, the probability climbs slowly. Ham training, by contrast,
>> converges quickly and takes few iterations.
> 
> Hmm, the wiki says: "never train the same message twice". Why?
> I am violating this rule badly.

This may be relevant to the "never train on the same message twice"
advice and to the ham/spam count imbalance:
<http://www.garyrobinson.net/2004/02/spam_filtering_.html>
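
To make the loop I described concrete, here is a minimal sketch in
Python. It is only an illustration under stated assumptions: ToyFilter
and train_until are hypothetical names, not the SpamBayes API. The
per-token estimate uses Robinson's smoothing f(w) = (s*x + n*p(w)) / (s + n)
with illustrative values for s and x, and the combining step is a plain
naive-Bayes product, not the chi-squared combining SpamBayes uses.

```python
from collections import Counter


class ToyFilter:
    """Hypothetical stand-in for a token-based filter; NOT the SpamBayes API."""

    def __init__(self):
        self.spam_tokens = Counter()
        self.ham_tokens = Counter()
        self.nspam = 0
        self.nham = 0

    def train(self, tokens, is_spam):
        # Every call counts as one more trained message, even for a repeat.
        if is_spam:
            self.spam_tokens.update(set(tokens))
            self.nspam += 1
        else:
            self.ham_tokens.update(set(tokens))
            self.nham += 1

    def token_prob(self, token, s=0.45, x=0.5):
        # Robinson-style smoothed estimate; s and x are illustrative values.
        spam_ratio = self.spam_tokens[token] / max(self.nspam, 1)
        ham_ratio = self.ham_tokens[token] / max(self.nham, 1)
        total = spam_ratio + ham_ratio
        p = spam_ratio / total if total else x
        n = self.spam_tokens[token] + self.ham_tokens[token]
        return (s * x + n * p) / (s + n)

    def score(self, tokens):
        # Simplified naive-Bayes product combination (an assumption of this
        # sketch, swapped in for chi-squared combining).
        spam_side = ham_side = 1.0
        for token in set(tokens):
            f = self.token_prob(token)
            spam_side *= f
            ham_side *= 1.0 - f
        return spam_side / (spam_side + ham_side)


def train_until(filt, tokens, is_spam,
                ham_cutoff=0.20, spam_cutoff=0.90, max_rounds=50):
    """Retrain on one message until its score crosses the cutoff.

    Returns the number of training passes performed.
    """
    rounds = 0
    while rounds < max_rounds:
        score = filt.score(tokens)
        if is_spam and score > spam_cutoff:
            break
        if not is_spam and score < ham_cutoff:
            break
        filt.train(tokens, is_spam)
        rounds += 1
    return rounds


# Made-up training data: a database dominated by ham.
filt = ToyFilter()
for _ in range(20):
    filt.train(["meeting", "report"], is_spam=False)
for _ in range(5):
    filt.train(["offer", "pills"], is_spam=True)

rounds = train_until(filt, ["meeting", "offer"], is_spam=True)
print(rounds, filt.nspam, round(filt.score(["meeting", "offer"]), 3))
```

In this toy model every pass through the loop also increments the spam
message count, so a single message ends up counted as several. That is
one way repeated training could feed the ham/spam count imbalance.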

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at jcea.es - http://www.jcea.es/     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at jabber.org         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQCVAwUBTpsCv5lgi5GaxT1NAQJowAP/VFphe8V/y+SXViQIe/LOP5lGaeZS5MUC
oiD+bkqrhDud5q+del8YSb8EYqnTnhzq+EKrxaNmMc5QqnqSEJYPWD4Z9BZOlumi
tCRhJ8UBCrFOSl4zLPmQFns+ZHDBxQhTHsuG5CpvZ7ZDjPE30vSdvyAbuK0ZqQ4C
v6QsWSquPoc=
=N8Nm
-----END PGP SIGNATURE-----

