[Spambayes] training WAS: aging information
T. Alexander Popiel
popiel at wolfskeep.com
Wed Feb 19 15:28:30 EST 2003
In message: <1ED4ECF91CDED24C8D012BCF2B034F1318CD64 at its-xchg4.massey.ac.nz>
"Meyer, Tony" <T.A.Meyer at massey.ac.nz> writes:
>[Alex]
>> I was the one who did the bulk of the ratio experiments, and I
>> posted my results at http://www.wolfskeep.com/~popiel/spambayes.
>>
>> It would be worthwhile to rerun similar experiments with current
>> versions of the code, too.
>
>Thanks for (re)posting this link; it is certainly interesting reading. Were
>these done before or after the experimental_ham_spam_in_balance code?
Before; I like to think that my results were in part responsible for
getting that option added.
>What I would like to know (and I suspect others would too) is this: say
>my stored mail has a ham:spam ratio of 300:3000. Should I randomly
>choose 300 ham and have a 300:300 ratio? Or is giving up the
>information in the other 2700 messages a bad thing?
Well, as long as the 300 ham chosen are actually representative of
the types of ham you get, I don't see any harm in only using 300.
I don't have the math or the experimental results to back that up,
though.
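
For anyone who wants to try the balanced approach, here is a minimal
sketch of random subsampling in Python. It is not taken from the
Spambayes code: the function name balance_by_subsampling and the idea of
passing each corpus as a plain list of messages are assumptions made for
illustration. It trims whichever corpus is larger down to the size of
the smaller one.

```python
import random


def balance_by_subsampling(ham, spam, seed=None):
    """Return (ham_subset, spam_subset) with equal lengths.

    Hypothetical helper, not part of Spambayes. `ham` and `spam` are
    any iterables of messages (file names, strings, message objects).
    """
    ham = list(ham)
    spam = list(spam)
    # A private Random instance keeps the draw reproducible for a given
    # seed without touching the global random state.
    rng = random.Random(seed)
    n = min(len(ham), len(spam))
    # sample() draws without replacement, so no message is picked twice.
    return rng.sample(ham, n), rng.sample(spam, n)
```

Calling it with 300 messages of one kind and 3000 of the other gives
back 300 of each. Whether the random subset is representative of the
mail you actually receive is exactly the caveat above, and this sketch
does nothing to check it.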
>If someone was willing to do some more tests with the most recent code,
>I think lots of people would be interested.
I'm trying to, but life keeps interfering.
- Alex