[spambayes-dev] Generating a sample training database

Tim Peters tim.one at comcast.net
Wed Sep 17 22:35:36 EDT 2003


[Skip]
>> I'll take the lead in grabbing the ham and spam and putting
>> together a sample training database (pickle format seems
>> easiest).  If you'd like to contribute (no more than two ham
>> and two spam per person please), forward such messages to me

[Tony Meyer]
> Is there any particular sort of message that we should contribute?
> Something extremely hammy/spammy?  Something that we think is really
> generic?  Or just any random message we click on?

I suggest only msgs that scored 1.00 (rounded) for spam or 0.00 (rounded) for
ham when originally received (not after training on them) -- we're trying to
catch a good deal of blatant spam with a starter database, and can't fine-tune
it anyway.  You probably don't want to forward ham containing personal details.
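
For anyone who wants to apply that rule mechanically, here is a minimal
sketch.  It assumes you already have each message's original (pre-training)
score on hand; the (msg_id, score, is_spam) tuples and both function names
are made up for illustration and are not part of the spambayes code.

```python
def rounds_to_extreme(score, is_spam):
    """True if the score rounds (2 decimals) to 1.00 for spam or 0.00 for ham."""
    target = 1.00 if is_spam else 0.00
    return round(score, 2) == target


def pick_candidates(scored, per_class=2):
    """Pick up to `per_class` blatant ham and spam ids.

    `scored` is a hypothetical iterable of (msg_id, score, is_spam) tuples,
    where `score` is the score the message got when originally received.
    """
    picked = {"ham": [], "spam": []}
    for msg_id, score, is_spam in scored:
        label = "spam" if is_spam else "ham"
        # Keep only blatant messages, and stop at the per-person limit.
        if rounds_to_extreme(score, is_spam) and len(picked[label]) < per_class:
            picked[label].append(msg_id)
    return picked
```

The limit of two per class mirrors Skip's request above; checking for
personal details in the ham still has to be done by eye.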



